cat articles/noteable

Using ChatGPT's Noteable and WebPilot plugins to build a notebook that predicts OpenCALM 14B performance

I saw people saying that the Noteable plugin, which lets you create notebooks through conversation in ChatGPT, was impressive. I tried it with ChatGPT (GPT-4), and it worked better than I expected. For small analyses, the AI can now write the notebook for you, and you can check the results inside ChatGPT without opening the notebook itself.

For example, I asked it to fetch the parameter counts and PPL (perplexity) values for the OpenCALM models from the Hugging Face page and plot them. The result looked like this:

The plot of PPL against parameter count looked like this:

It also fit a linear regression to the models from 1B onward, the range where a linear trend seemed plausible. The predicted graph for a hypothetical OpenCALM-14B model looked like this. Since PPL decreases roughly linearly across the 1B, 3B, and 7B models, it seems likely that performance will keep improving as larger models appear.
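The linear fit described above can be sketched in plain Python with the closed-form least-squares formulas. This is only an illustration, not the code Noteable generated. The three (Params, Dev ppl) pairs are assumptions that do not appear in this post: Params is taken in millions, and the values are assumed to match the figures on the OpenCALM model card, so verify them on the Hugging Face page before relying on them. The name `fit_line` is hypothetical.

```python
# Assumed inputs, not taken from this post: parameter counts in millions
# and dev perplexity for the 1B, 3B, and 7B models.
params_millions = [1400.0, 2700.0, 6800.0]
dev_ppl = [10.3, 9.7, 8.2]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(params_millions, dev_ppl)
print(f"Intercept: {intercept:.4f}")
print(f"Coefficient: {slope:.6f}")
```

With these assumed inputs the fit comes out at roughly 10.7928 and -0.000383, in line with the intercept and coefficient ChatGPT reported.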

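The intercept and coefficient of the linear regression model created earlier are as follows:

Intercept: 10.7928
Coefficient: -0.000383

This means the model is expressed in the following form:

Dev ppl = 10.7928 - 0.000383 * Params

In other words, each time Params increases by 1, Dev ppl is predicted to decrease by about 0.000383.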
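As a quick check of the formula Dev ppl = 10.7928 - 0.000383 * Params, the sketch below evaluates it for a hypothetical 14B model. It assumes Params is measured in millions of parameters, so 14B is entered as 14000. The unit is not stated in the ChatGPT response, and the function name `predict_dev_ppl` is made up for illustration.

```python
# Coefficients as reported in the ChatGPT response.
INTERCEPT = 10.7928
COEFFICIENT = -0.000383


def predict_dev_ppl(params_millions):
    """Predicted dev perplexity; assumes Params is in millions."""
    return INTERCEPT + COEFFICIENT * params_millions


# Hypothetical 14B model, entered as 14000 under the unit assumption.
prediction = predict_dev_ppl(14000)
print(f"Predicted Dev ppl at 14B: {prediction:.2f}")
```

Under that assumption the line gives a Dev ppl of about 5.43. Note that 14000 lies well outside the 1B to 7B range the line was fit on, and the same line would turn negative past roughly 28,000 (28B), so the linear trend cannot hold indefinitely.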

The ChatGPT conversation looked like this. This exchange does not show it, but if I ask for a nonlinear model instead of linear regression, such as an SVM, a neural network, or a polynomial, it builds that as well. It seems very convenient.

The notebook created by Noteable looked like this. The data scraped by WebPilot is placed in the first cell.

Until now, the natural flow was to collect data from a web page, shape it into CSV or Python code, and then analyze it in a notebook using familiar steps. Being able to do all of that quickly in natural language is extremely convenient. If I want to do something more complex, the notebook already exists, so I can continue the analysis by adding a little myself. Turning the usual notebook workflow into "mostly let AI do it, then have a human make the final adjustments" is a real strength.

Tedious work keeps disappearing, which feels good.

cat related_articles/noteable.yaml

  1. Analyzing the Iris dataset with ChatGPT's Noteable plugin
     After trying Noteable on a tiny OpenCALM dataset, I asked it to analyze the classic Iris dataset. It quickly generated plots, model comparisons, clustering, and dimensionality reduction notebooks.
  2. Enjoying Stable Diffusion again from a technical perspective
     After using Stable Diffusion again through stable-diffusion-webui, I wrote notes on the surrounding techniques I had not followed closely: ControlNet, LoRA, textual inversion embeddings, and checkpoint merging.
  3. Reading Basic Statistics by Kimio Miyakawa: statistics before machine learning
     After several months of studying machine learning, I realized I was missing the statistical foundations needed to understand data, experiments, estimation, testing, and model evaluation.