Advanced Data Analysis, a ChatGPT plugin developed by OpenAI, performs tasks like data analysis by running computer code in response to prompts given in plain language. Organizations can use it for a variety of purposes, such as generating graphs for marketing reports or analyzing financial data.
As a reporter covering artificial intelligence in the workplace, I’m always on the lookout for new ways to use the technology for my own work. So my interest was piqued when I learned in a conversation with Rebecca Hinds, head of The Work Innovation Lab at Asana, that her team used Advanced Data Analysis, previously called Code Interpreter, to help them analyze data for their new report on AI and work.
To see how it might help me interpret data for an article I was writing, I uploaded a dataset with information about job postings and started asking it some questions. I quickly saw how much the tool lowers the barrier to entry for data science: It let me make tables and graphs, edit files and convert them to different formats, and run statistical tests. Rather than memorize specific commands, I could just tell it what I wanted.
Genuinely impressed with what I saw, I tested Advanced Data Analysis on a handful of additional datasets to see how it performed on tasks that would be useful in a business setting. Here are some of the experiments I ran and the lessons I learned.
Background and privacy
When you ask Advanced Data Analysis to do something, it writes and runs code in Python, a programming language. The tool is currently a beta feature available to users of ChatGPT Plus, which costs $20 per month; you can turn it on in settings under “Beta features.” You can also access it through OpenAI’s new ChatGPT Enterprise plan (inquire for pricing).
If you access Advanced Data Analysis through ChatGPT Plus rather than ChatGPT Enterprise, you should refrain from sharing sensitive data with it, because the chatbot improves by training on the conversations it has with people. You can’t access Advanced Data Analysis while your chat history is disabled.
According to OpenAI, the ChatGPT Enterprise plan, which also offers the Advanced Data Analysis plugin, does not train on conversations.
I used a handful of different datasets to test out Advanced Data Analysis:
- A smaller dataset about job postings, shared by workforce intelligence company Revelio Labs.
- A smaller dataset from a recent survey by Charter about workers’ views toward AI.
- A few larger datasets from this list of public datasets.
- The summary table from WeWork’s S-1.
- The summary table from Google’s S-1.
Something you’ll notice when using Advanced Data Analysis is that the plugin seems to have a pretty good ‘understanding’ of what it’s looking at. When I gave it the job postings file, for example, it knew what it was about without me providing any additional context, and it was able to infer what abbreviations in the dataset stood for (e.g., it figured out that “postingcount_ai” stood for the number of job postings related to AI for the specific job role in the given month).
One of the most useful features of Advanced Data Analysis is its ability to quickly find facts in a sea of data, essentially making it a very powerful Command-F tool (though it’s capable of much more). Using the plugin and the Revelio Labs job postings dataset, I was able to ask questions like, ‘What are the five job categories whose job postings most often mentioned AI in July?’ and it would generate the answer in about 10 seconds. You could find this answer with traditional programs like R or Excel, but you would have to know which commands or formulas to use. With Advanced Data Analysis, you can ask in plain language.
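For readers curious what a question like that looks like as code, here is a minimal Python sketch. It is not the code Advanced Data Analysis produced, which isn’t shown here, and it uses only Python’s standard library. The column name “postingcount_ai” comes from the dataset described above; the “month” and “job_category” column names, the “2023-07” month format, and every number in the sample rows are assumptions invented for illustration, not Revelio Labs figures.

```python
import csv
import io

# Invented sample rows for illustration only; these are NOT Revelio Labs figures.
# "postingcount_ai" appears in the article; "month" and "job_category"
# are assumed column names, and "2023-07" is an assumed month format.
SAMPLE_CSV = """month,job_category,postingcount_ai
2023-07,Data Scientist,120
2023-07,Software Engineer,310
2023-07,Marketing Manager,45
2023-07,Financial Analyst,60
2023-07,Product Manager,95
2023-07,Recruiter,12
2023-06,Software Engineer,280
"""


def top_categories(csv_text, month, n=5):
    """Return the n (job_category, postingcount_ai) pairs with the most AI postings in a month."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Keep only the rows for the requested month.
    in_month = [
        (row["job_category"], int(row["postingcount_ai"]))
        for row in rows
        if row["month"] == month
    ]
    # Sort from the largest count to the smallest and keep the first n.
    return sorted(in_month, key=lambda pair: pair[1], reverse=True)[:n]


for category, count in top_categories(SAMPLE_CSV, "2023-07"):
    print(f"{category}: {count}")
```

The whole query comes down to a filter, a sort, and a cutoff, which is the part you would otherwise need to know how to express in R or Excel.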
Beyond simple descriptive statistics, Advanced Data Analysis is capable of running statistical tests. For example, in our Charter survey about workers’ views toward AI, we have data on whether or not people are using generative AI in their jobs, broken down by gender. Something we noticed in the dataset is that a higher share of male workers than of female workers are using generative AI in their jobs. But is that difference statistically significant (in other words, is a gap this large unlikely to have arisen by chance alone)? With the Advanced Data Analysis plugin, I was able to run a chi-square test to find that the difference is, in fact, statistically significant.
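To give a sense of what a chi-square test on a two-by-two table involves, here is a minimal Python sketch. It is an illustration under stated assumptions, not the code the plugin ran: the counts are hypothetical and are not the Charter survey results, and the function name is made up for this example. It computes the plain Pearson chi-square statistic by hand; statistics libraries sometimes apply a continuity correction to two-by-two tables by default, so their numbers can differ slightly.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (no continuity correction).

    Returns (chi-square statistic, p-value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # A 2x2 table has one degree of freedom, and for one degree of freedom
    # the chi-square tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# HYPOTHETICAL counts, invented for illustration (not the survey data).
# Rows: men, women. Columns: uses generative AI at work, does not.
HYPOTHETICAL_COUNTS = [[60, 40], [45, 55]]

chi2, p_value = chi_square_2x2(HYPOTHETICAL_COUNTS)
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")
print("significant at the 0.05 level" if p_value < 0.05 else "not significant at the 0.05 level")
```

A small p-value means a gap at least this large would be rare if the two groups really used the technology at the same rate; the conventional cutoff is 0.05.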
Another feature that impressed me about Advanced Data Analysis is its ability to make edits to the files you give it. For the job postings file, for example, I wanted to group the entries by month (all January entries together, all February entries together, and so on) and then sort each month’s entries in descending order by one of the variables. Here’s what I asked it to do:
Rearrange this file by putting all of the entries in month order, and then putting them in descending order for the variable “postingcount_ai_share.” So like this:
- All January entries in descending order by “postingcount_ai_share”
- All February entries in descending order by “postingcount_ai_share.”
- And so on.
Based on that prompt, Advanced Data Analysis rearranged the file and gave me a downloadable link to a new, organized spreadsheet. Users can also use Advanced Data Analysis to convert files from one format to another.
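The rearrangement described in that prompt, along with a simple format conversion, can be sketched in a few lines of Python. This is a hedged illustration using only the standard library, not the code the tool generated. The column name “postingcount_ai_share” comes from the prompt above; the “month” and “job_category” columns, the “2023-01” month format (chosen because it sorts chronologically as text), the function names, and all of the sample values are assumptions made up for this example.

```python
import csv
import io
import json

# Invented sample rows for illustration only; not the real dataset.
# "postingcount_ai_share" appears in the article; the other column names
# and the "YYYY-MM" month format are assumptions.
SAMPLE_CSV = """month,job_category,postingcount_ai_share
2023-02,Recruiter,0.01
2023-01,Data Scientist,0.30
2023-02,Software Engineer,0.12
2023-01,Marketing Manager,0.04
2023-01,Software Engineer,0.11
"""


def sort_postings(csv_text):
    """Order rows by month, then by postingcount_ai_share from high to low within each month."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Negating the share makes the second sort key descending.
    return sorted(rows, key=lambda r: (r["month"], -float(r["postingcount_ai_share"])))


def to_csv(rows):
    """Serialize a list of row dictionaries back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


sorted_rows = sort_postings(SAMPLE_CSV)

# The reorganized spreadsheet, as CSV text.
print(to_csv(sorted_rows))

# The same rows converted to another format (JSON).
print(json.dumps(sorted_rows, indent=2))
```

Sorting on a pair of keys handles both steps of the prompt in a single pass: the month decides the group, and the share decides the order inside it.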
For more use cases and tips, read the full column at charterworks.com.