AI Made Friendly HERE

InfoQ AI, ML and Data Engineering Trends Report – September 2024

Key Takeaways

  • The future of AI is open and accessible. We’re in the age of LLM and foundation models. Most of the models available are closed source, but companies like Meta are trying to shift the trend toward open-source models.
  • Retrieval Augmented Generation (RAG) will become more important especially for applicable use cases of LLMs at scale.
  • AI-powered hardware will get much more attention with AI-enabled GPU infrastructure and AI-powered PCs.
  • Due to the constraints in infrastructure setup and management costs of LLMs, small language models (SLMs) will see more exploration and adoption.
  • Small language models are also excellent for edge computing-related use cases that run on small devices.
  • AI Agents, like coding assistants, will also see more adoption, especially in corporate application development settings.
  • AI safety and security will continue to be important in the overall management lifecycle of language models. Self-hosted models and open-source LLM solutions can help improve the AI security posture.
  • Another important aspect of the LLM lifecycle is LangOps or LLMOps, which help support the models after deploying them to production.

The InfoQ Trends Reports offer InfoQ readers a comprehensive overview of emerging trends and technologies in the areas of AI, ML, and Data Engineering. This report summarizes the InfoQ editorial team’s podcast with external guests to discuss the trends in AI and ML and what to look out for in the next 12 months. In conjunction with the report and trends graph, our accompanying podcast features insightful discussions of these trends.

AI and ML Trends Graph

An important part of the annual trends report is the trends graph, which shows what trends and topics have made it to the innovators category and which ones have been promoted to early adopters and early majority categories. The categories are based on the book “Crossing the Chasm”, by Geoffrey Moore. At InfoQ, we mostly focus on categories that have not yet crossed the chasm. Here is this year’s graph:

AI technologies have seen significant innovations since the InfoQ team discussed the trends report last year.

This article highlights the trends graph that shows different phases of technology adoption and provides more details on individual technologies that were added or updated since last year’s trend report. We also discuss which technologies and trends have been promoted in the adoption graph.

Here are some highlights of what has changed since last year’s report:

Innovators

Let’s first start with new topics added to the Innovators category. Retrieval Augmented Generation (RAG) techniques will become crucial for corporations who want to use LLMs without sending them to cloud-based LLM providers. RAG will also be useful for applicable use cases of LLMs at scale.

Another new entrant in the Innovators category is the AI-integrated hardware, which includes AI-enabled GPU infrastructure and AI-powered PCs, mobile phones, and edge computing devices. This is going to see significant development in the next 12 months.

LLM-based solutions have challenges in the areas of infrastructure setup and management costs. So, a new type of language model called Small Language Models (SLMs) will be explored and adopted. SLMs are also perfect for edge computing related use cases to run on small devices. Companies like Microsoft have released Phi-3 and other SLMs that the community can start trying out immediately to compare the cost and benefits of using an SLM versus an LLM.

Early Adopters

With the tremendous changes in Generative AI technologies and the recent release of the latest versions of LLMs from tech companies like OpenAI (GPT-4o), Meta (LLAMA3), and Google (Gemma), we think “Generative AI / Large Language Models (LLMs)” topic is now ready to be promoted from Innovators category to Early Adopters category.

Another topic moving into this category is Synthetic Data Generation, as more and more companies are starting to use it heavily in model training.

Early Majority

AI coding assistants will also see more adoption, especially in corporate application development settings. So, this topic is being promoted to the Early Majority category.

Image recognition techniques are also used in several industrial organizations for use cases like defect inspection and detection to help with preventive maintenance and minimize or eliminate machine failures.

AI and ML Trends Highlights

Innovations in Language Models

ChatGPT was rolled out in November 2022. Since then, Generative AI and LLM technologies have been moving at the maximum speed in terms of innovation, and they don’t seem to be slowing down any time soon.

All the major players in the technology space have been very busy releasing their AI products.

Earlier this year, at the Google I/O Conference, Google announced several new developments, including Google Gemini updates and “Generative AI in Search,” which will significantly change the search as we know it.

Around the same time, OpenAI released GPT-4o, the “omni” model that can work with audio, vision, and text in real time.

LLAMA3 was also released by Meta at the same time with their recent release of LLAMA version 3.1, which is based on 405 billion parameters.
Open-source LLM solutions like OLLAMA are getting a lot of attention.

Generative AI, with the release of the latest versions of their Large Language Models (LLMs) by major players in this space, like GPT-4o, LLAMA3, and Gemini, continues to be a major force in the AI and ML industry. There are newer versions of the popular foundation language models, and other companies like Anthropic (Claude) and Mistral (Mixtral) are offering LLMs.

Another new trend in LLMs is the context length; the amount of data you can put into the model to give you an answer is increasing. Mandy Gu discussed longer context windows in this year’s podcast:

“It’s definitely a trend that we’re seeing with longer context windows. And originally, when ChatGPT and LLMs first got popularized, this was a shortcoming that a lot of people brought up. It’s harder to use LLM at scale, or as more as you called it, when we had restrictions around how much information we could pass through it. Earlier this year, Gemini, the Google Foundation, this GCP foundational model, introduced the one million context window length, and this was a game changer because, in the past, we’ve never had anything close to it. I think this has sparked the trend where other providers are trying to create similarly long or longer context windows.”

Small Language Models

Another major trend in language model evolution is the new small language models. These specialized language models offer many of the same capabilities found in LLMs but are smaller, trained on smaller amounts of data, and use fewer memory resources. Small language models include Phi-3, TinyLlama, DBRX, and Instruct.

LLM Evaluation

With several LLM options available, how do we compare different language models to choose the best model for different data or workloads in our applications? LLM evaluation is integral to successfully adopting AI technologies in companies. There are LLM comparison sites and public leaderboards like Huggingface’s Chatbot Arena, Stanford HELM, Evals framework and others.

The InfoQ team recommends that the LLM app developers use domain/use-case-specific private evaluation benchmarks to track changes in model performance. Use the human-generated ones, ideally those that have not yet been polluted/leaked into LLM training data, to help provide an independent measure of model quality over time.

From the podcast,

“[…] Business value is something important we should think about for evaluation. I’m also quite skeptical of these general benchmarking tests, but I think what we should really do is…evaluate the LLMs, not just the foundational models, but the techniques and how we orchestrate the system at the task at hand. So if, for instance, the problem I’m trying to solve is…to summarize a research paper [and] I’m trying to distill the language, I should be evaluating the LLMs capabilities for this very specific task because going back to the free lunch theorem, there’s not going to be one set of models or techniques that’s going to be the best for every task.”

AI Agents

The AI-enabled agent programs are another area that’s seeing a lot of innovation. Autonomous agents and GenAI-enabled virtual assistants are coming up in different places to help software developers become more productive. AI-assisted programs can enable individual team members to increase productivity or collaborate with each other. Gihub’s Copilot, Microsoft Teams’ Copilot, DevinAI, Mistral’s Codestral, and JetBrains’ local code completion are some examples of AI agents.

GitHub also recently announced its GitHub Models product to enable the large community of developers to become AI engineers and build with industry-leading AI models.

To quote Roland Meertens from the podcast,

“We see some things like, for example, Devin, this AI software engineer where you have this agent which has a terminal, a code editor, a browser, and you can basically assign it as a ticket and say, ‘Hey, try to solve this.’ And it tries to do everything on its own. I think at the moment Devin had a success rate of maybe 20%, but that’s pretty okay for a free software engineer.”

Daniel Dominguez mentioned that Meta will offer a new Meta AI agent for small businesses to help small business owners automate a lot of things in their own spaces. HuggingChat also has AI agents for daily workflows. Slack now has AI agents to help summarize conversations, tasks, or daily workflows.

AI Powered Hardware

AI-integrated hardware is leveraging the power of AI technologies to revolutionize the overall performance of every task. AI-enabled GPU infrastructure like NVIDIA’s GeForce RTX and AI-powered PCs like Apple M4, mobile phones, and edge computing devices, can all help with faster AI model training and fine-tuning as well as faster content creation and image generation.

Safe and Secure AI Solutions

With all the developments in Gen AI and Language Models, safely and securely deploying these AI applications is crucial to protect consumer and company data privacy and security.

With the emergence of multi-model language models like GPT-4o, privacy and security when handling non-textual data like videos become even more critical in the overall machine learning pipelines and DevOps processes.

The podcast panelist’s AI safety and security recommendations are to have a comprehensive lineage and mapping of where your data is going. Train your employees to have proper data privacy security practices, and also make the secure path the path of least resistance for them so everyone within your organization easily adopts it. Other best practices include: Making sure your workflow has auditability in place so that you can trace all the interactions between all the inferences. Some relevant questions: Are there potential attack surfaces in my design workflow? What about prompt injection?

LangOps/LLMOps

Another critical aspect of LLMs and AI technologies is hosting language models in production and managing the entire life cycle of language models. LangOps or LLMOps consists of best practices in supporting the models in production.

Mandy Gu shared her team’s LLMOps experience based on LLM efforts at their company:

“We started enabling self-hosted models where we can easily add an open-source model, we fine-tune it, we can add it to our platform, make it available for inferencing for both our systems and for our end users through the LLM gateway. And then from there, we looked at building retrieval as a reusable API, building up the scaffolding and accessibility around our vector database. And then slowly as we started platformatizing more and more of these components, our end users who are like the scientists, the developers, various folks within the business, they started playing around with it and identifying like, ‘Hey, here’s a workflow that would actually really benefit from LLMs.’ And then this is when we step in and help them productionize that, deploy it, and deploy the products at scale.”

AR/VR Technologies

Due to time constraints in the podcast, the group didn’t have time to discuss the AR/VR latest trends, but this is an important topic so we want to briefly cover it in this article.

Augmented Reality and Virtual Reality applications can benefit greatly from the latest AI technology innovations. Apple and Meta have recently released their VR products, including Apple Vision Pro, Meta Quest Pro + Meta Quest 3, and Ray-Ban Meta. All these products can take the application development and user experience to the next level by leveraging the innovations of the AI and Language Models.

Conclusion

The future of AI is open and accessible. Even though most currently available models are closed source, companies are trying to shift the trend toward open-source models. This year, Retrieval Augmented Generation (RAG) will become more important especially for applicable use cases of LLMs at scale. AI-enabled hardware like PCs and edge devices will get a lot more attention. Small language models (SLMs) will also be explored and adopted more. These models are excellent for edge computing related use cases to run on small devices. The security and safety aspects in AI-based applications will continue to be important in the overall management lifecycle of language models.

Originally Appeared Here

You May Also Like

About the Author:

Early Bird