Is the End of Data Analysis Near? The Impact of Code Interpreter, the New Feature of GPT-4


Posted by ar851060 on 2023-07-17

Published on: July 13, 2023

The recent release of the code interpreter feature in ChatGPT Plus has caused a wave of excitement and shock with its powerful data analysis capabilities. Users can upload files and have the tool analyze the data, show the Python code it runs, suggest analyses, perform exploratory data analysis (EDA), preprocess and clean data, engineer features, build models, make predictions, and offer decision suggestions. It can also explain code, assist in debugging, identify efficiency bottlenecks, teach, and even write code. Code interpreter looks like a game-changer for data science and software engineering.
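
To make the workflow above concrete, here is a minimal sketch of the kind of cleaning, EDA, modeling, and prediction steps such a tool can produce. It is not actual output from code interpreter: the toy dataset and the helper names (`clean`, `describe`, `fit_line`, `predict`) are hypothetical, and it uses only the Python standard library.

```python
import statistics

# Hypothetical toy data: (advertising spend, sales); None marks a missing value.
raw_rows = [(1.0, 2.1), (2.0, 3.9), (3.0, None), (4.0, 8.2), (5.0, 9.8)]


def clean(rows):
    """Preprocessing: drop rows whose target value is missing."""
    return [(x, y) for x, y in rows if y is not None]


def describe(values):
    """EDA: basic summary statistics for one column."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def fit_line(rows):
    """Modeling: ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Prediction: evaluate the fitted line at a new x."""
    slope, intercept = model
    return slope * x + intercept


rows = clean(raw_rows)
summary = describe([y for _, y in rows])
model = fit_line(rows)
forecast = predict(model, 6.0)

print(summary)
print(f"slope={model[0]:.3f}, intercept={model[1]:.3f}, forecast at x=6: {forecast:.2f}")
```

The mechanical steps are short. Deciding which question to ask of the data, and what to do with the answer, is not part of the code at all.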

Some people remain optimistic that code interpreter will not replace data analysts, but the possibility deserves to be taken seriously. The characters in the movie "Hidden Figures" worked as human "computers," and their jobs were eventually taken over by actual computers; technological advances tend to reshape industries. Code interpreter currently has limitations, such as the inability to connect to databases or handle large datasets, but these are likely to be overcome in due course. Keep in mind that ChatGPT was released to the public on November 30, 2022, less than a year ago. AI certainly has its own set of problems, but it is a young field that continues to evolve rapidly.

Impact on Different Roles

Let's explore the impact of code interpreter on various roles within the data science and engineering domain:

  • Data Analysts and Junior Data Scientists

    Data analysts and junior data scientists are among the most affected by this new feature. If a data scientist's job description focuses primarily on AI skills, the impact may be minimal. The value of a data analyst, however, lies in uncovering insights and reporting them to clients. In the traditional data analysis process, one identifies a problem, formulates it as a data question, analyzes the data to find insights, creates slides, and presents the findings and actionable suggestions to the client. Except for problem identification and actionable suggestions, every step can now be automated with AI. Those two steps rely heavily on domain knowledge, which makes domain knowledge even more critical than statistical and coding skills. It is plausible that in the near future the role of data analyst will evolve into that of a business analyst (BA) or project manager (PM) who analyzes products or projects. BAs and PMs would only need introductory data science courses to use ChatGPT for analysis. Junior data scientists may face the same pressure.

  • AI Engineers and Machine Learning Engineers

    In the near future, most tasks solvable by AI will likely be handled by AI models from large companies like OpenAI. These models require significant computational resources and can only be trained by major industry players such as Google, Meta, and Microsoft. Roughly 80% of AI engineers may find themselves primarily applying these existing models, while the remaining 20% may work closely with the massive models themselves, either distilling them for data security purposes or helping people who are unfamiliar with deploying AI in their own settings. As with cloud engineers, most AI engineers will need to learn how to use these powerful models effectively, and only a small share will get to work on the underlying algorithms.

  • Software Engineers and Data Engineers

    Data engineers, who play a crucial role in data collection, transformation, and storage, are relatively less affected than other data science roles. After all, without data there can be no AI (excluding zero-shot learning advancements). For software engineers, code interpreter currently offers suggestions for code improvement, code reviews, and assistance, but it may eventually evolve into a low-code platform and reduce the demand for software engineers. Data engineers may face some challenges too, but their expertise in data handling will remain relevant for the foreseeable future.

  • Entry-Level White-Collar Workers and Junior Managers

    With a significant portion of tasks now automated by AI, entry-level white-collar workers and junior managers may find their roles affected. However, user experience (UX) design may still require a human touch, which could make it harder to automate than other entry-level work.

  • Researchers and Senior Data Scientists

    Researchers and senior data scientists, who are often involved in A/B testing and causal inference, may also feel some impact. AI models can already suggest experiments, and if physics-informed AI and causal AI mature, a substantial portion of scientific research could conceivably be driven by these systems.

  • Education Sector

    Even professionals in the education sector may face challenges, as AI advances to the point where it can help students write their homework. This automation may reduce the demand for human intervention in certain educational tasks. We need to rethink what we want to teach the next generation.

Recommendations

    For data analysts and junior data scientists, it is advisable to deepen their domain knowledge or explore the field of experimentation.

    AI engineers should keep up with the latest trends but avoid rushing into new ventures, as some AI startups have already been overtaken by the introduction of code interpreter.

    Prospective data science professionals should reconsider their career choices: the number of data science jobs may well decrease, potentially leaving room for only the top 20% of candidates.

    Those currently studying statistics or data science may find alternative career paths in areas involving smaller datasets, experimentation, or fields traditionally suited to statisticians. Viable options include reliability and experiment design in manufacturing, actuarial work in finance, clinical research organizations, and drug development in the life sciences. Alternatively, an advanced degree such as a PhD can provide specialized data analysis skills that AI models may struggle to replicate, for example techniques like correspondence analysis.
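
As a small illustration of the experimentation skills recommended above, here is a minimal sketch of a two-sided, two-proportion z-test for an A/B test. The function name and the conversion counts are hypothetical, and the pooled-variance z-test is only one of several reasonable ways to analyze such an experiment.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: variant A converts 100 of 1000 users, variant B 130 of 1000.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

Running the test is the easy part. Choosing what to test, how many users to include, and whether the difference matters for the business is where domain knowledge comes in.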

In conclusion, the introduction of code interpreter in ChatGPT Plus has the potential to reshape the data science and engineering landscape. Some roles will face challenges, so it is important to adapt, embrace the change, and look for the new opportunities that arise as AI continues to evolve. In the next article, I will show how powerful code interpreter really is.


#free talk








