Commentary: After analyzing recent Microsoft developer content, expert Simon Bisson says there is a big clue to how Bing Chat plug-ins will work.
If there’s one thing to know about Microsoft, it’s this: Microsoft is a platform company. It exists to provide tools and services that anyone can build on, from its operating systems and developer tools, to its productivity suites and services, and on to its global cloud. So, we shouldn’t be surprised when an announcement from Redmond talks about “moving from a product to a platform.”
The latest such announcement was for the new Bing GPT-based chat service. Infusing search with artificial intelligence has allowed Bing to deliver a conversational search environment that builds on its Bing index and OpenAI’s GPT-4 text generation and summarization technologies.
Instead of working through a list of pages and content, your queries are answered with a brief text summary and relevant links, and you can use Bing’s chat tools to refine your answers. It’s an approach that takes Bing back to one of its initial marketing points: helping you make decisions as much as search for content.
ChatGPT has recently added plug-ins that extend it into more focused services; as part of Microsoft’s evolutionary approach to adding AI to Bing, Bing will soon do the same. But one question remains: How will it work? Luckily, there’s a big clue in the shape of one of Microsoft’s many open-source projects.
Semantic Kernel: How Microsoft extends GPT
Microsoft has been developing a set of tools for working with its Azure OpenAI GPT services called Semantic Kernel. It’s designed to deliver custom GPT-based applications that go beyond the initial training set by adding your own embeddings to the model. At the same time, you can wrap these new semantic functions with traditional code to build AI skills, such as refining inputs, managing prompts, and filtering and formatting outputs.
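To make that pattern concrete, here’s a minimal sketch of it in Python. It’s not the Semantic Kernel API itself: call_llm() is a hypothetical stand-in for an Azure OpenAI completion call, and the prompt template and skill are invented for illustration.

```python
# A minimal sketch of the "semantic function wrapped in native code" pattern,
# not the actual Semantic Kernel API.

def call_llm(prompt: str) -> str:
    # Hypothetical completion call; returns canned text so the sketch runs.
    return "- Point one\n- Point two\n- Point three"

# A "semantic function" is, at heart, a parameterized prompt template.
SUMMARIZE_PROMPT = "Summarize the following text as three bullet points:\n{input}"

def summarize(text: str) -> str:
    return call_llm(SUMMARIZE_PROMPT.format(input=text))

# Traditional code wraps the semantic function into a skill: it refines
# the input before prompting and filters and formats the output afterward.
def summarize_skill(raw_text: str) -> list[str]:
    cleaned = " ".join(raw_text.split())        # refine the input
    summary = summarize(cleaned[:4000])         # keep the prompt within limits
    return [line.lstrip("-• ").strip()          # format the output
            for line in summary.splitlines() if line.strip()]

print(summarize_skill("Some   long  document text..."))
```

The division of labor is the point: the model handles the language, while ordinary code handles everything deterministic around it.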
While details of Bing’s AI plug-in model won’t be released until Microsoft’s BUILD developer conference at the end of May, it’s likely to be based on the Semantic Kernel AI skill model.
Designed to work with and around OpenAI’s application programming interface, Semantic Kernel gives developers the tooling they need to manage context between prompts, to add their own data sources for customization, and to link inputs and outputs to code that can refine and format results or connect them to other services.
Building a consumer AI product with Bing made a lot of sense. When you drill down into the underlying technologies, both GPT’s AI services and Bing’s search engine take advantage of a relatively little-understood technology: vector databases. These give GPT transformers what’s known as “semantic memory,” helping the model find content that’s related to your prompts.
A vector database stores content in a space that can have as many dimensions as your data is complex. Instead of storing your data in a table, a process known as “embedding” maps it to vectors that have a length and a direction in that space. Similar content ends up close together, whether it’s text or an image; all your code needs to do is find the stored vectors that point in much the same direction as the vector for your query. It’s fast, and it adds a certain serendipity to a search.
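That similarity test is usually cosine similarity: a measure of how closely two vectors point in the same direction. A toy version in Python with NumPy shows the idea; a production vector database adds indexing so the search stays fast across millions of vectors.

```python
# A toy nearest-neighbor search over embedding vectors, using cosine
# similarity (direction) as the measure of relatedness.
import numpy as np

def most_similar(query: np.ndarray, corpus: np.ndarray, k: int = 3) -> np.ndarray:
    # Normalize so the dot product becomes cosine similarity.
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q                       # one similarity score per document
    return np.argsort(scores)[::-1][:k]  # indices of the k closest vectors

rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 128))   # 1,000 documents, 128-dim embeddings
query = corpus[42] + rng.normal(scale=0.1, size=128)  # a near-duplicate
print(most_similar(query, corpus))      # document 42 should rank first
```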
Giving GPT semantic memory
GPT uses vectors to extend your prompt, generating text that’s similar to your input. Bing uses them to cluster related web pages, speeding up the process of finding the content you’re looking for. When you add an embedded data source to a GPT chat service, you’re giving it information it can use to respond to your prompts, answers it can then deliver as text.
One advantage of using embeddings alongside Bing’s data is that you can use them to add your own long-form text to the service, for example, documents from inside your own organization. By delivering a vector embedding of key documents as part of a query, you could combine search and chat to draft commonly used documents, pulling in data from a search and even from other Bing plug-ins you may have added to your environment.
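A rough sketch of that flow, with embed() and complete() as hypothetical stand-ins for an embedding endpoint and a GPT completion endpoint, might look like this:

```python
# A sketch of the "embed your own documents" flow. embed() and complete()
# are hypothetical; a real embedding model replaces the hash-seeded randomness.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical embedding call, deterministic so the sketch is runnable.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=128)

def complete(prompt: str) -> str:
    return "(generated answer)"   # hypothetical GPT completion call

def most_similar(q: np.ndarray, c: np.ndarray, k: int) -> np.ndarray:
    q = q / np.linalg.norm(q)     # cosine search, as in the earlier sketch
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    return np.argsort(c @ q)[::-1][:k]

def answer_with_documents(question: str, documents: list[str]) -> str:
    doc_vectors = np.stack([embed(d) for d in documents])
    ranked = most_similar(embed(question), doc_vectors, k=2)
    context = "\n---\n".join(documents[i] for i in ranked)
    # The retrieved text travels with the question as semantic memory.
    return complete(f"Using only this context:\n{context}\n\nAnswer: {question}")
```

The key step is the last one: the retrieved documents ride along with the question, so the model answers from your content rather than only from its training data.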
Giving Bing Chat skills
You can see signs of something much like the public Semantic Kernel at work in the latest Bing release, which adds features that take GPT-generated and processed data and turn it into graphs and tables, helping visualize results. If a prompt asks GPT to return a list of values, post-processing code can quickly turn that text output into graphics.
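The post-processing step is easy to picture. Assuming the model has been prompted to reply with structured values (the JSON reply here is invented, and matplotlib merely stands in for whatever charting Bing uses internally), a few lines of code get you from text to chart:

```python
# Turning a structured model reply into a graphic.
import json
import matplotlib.pyplot as plt

reply = '{"labels": ["2020", "2021", "2022"], "values": [4.1, 5.3, 6.0]}'
data = json.loads(reply)                 # parse the model's text output
plt.bar(data["labels"], data["values"])  # turn the list into a graphic
plt.title("Values extracted from a GPT response")
plt.show()
```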
As Bing is a general-purpose search engine, adding new skills that link to more specialized data sources will allow you to make more specialized searches (e.g., working with a repository of medical papers). And as skills will allow you to connect Bing results to external services, you could easily imagine a set of chat interactions that first help you find a restaurant for a special occasion and then book your chosen venue — all without leaving a search.
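Purely as an illustration of that kind of chaining, with every skill and service here hypothetical, the shape of the code is simply one skill’s output feeding the next:

```python
# Hypothetical skill chaining: a search skill feeds a booking skill.
def find_restaurants(query: str) -> list[str]:
    return ["Chez Test", "Trattoria Demo"]   # stand-in for a search skill

def book_table(venue: str, when: str) -> str:
    return f"Booked {venue} for {when}"      # stand-in for a booking service

candidates = find_restaurants("romantic dinner near the harbor")
print(book_table(candidates[0], "Saturday 19:30"))
```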
By providing a framework for both private and public interactions with GPT-4, and by adding support for persistence between sessions, Microsoft should end up with a chat environment that feels much more natural than traditional search applications.
With plug-ins to extend that model to other data sources and to other services, there’s scope to deliver the natural language-driven computing environment that Microsoft has been promising for more than a decade. And by making it a platform, Microsoft is ensuring it remains an open environment where you can build the tools you need and don’t have to depend on the tools Microsoft gives you.
Microsoft is using its Copilot branding for all of its AI-based assistants, from GitHub’s GPT-based tooling to new features in both Microsoft 365 and Power Platform. Hopefully, it will continue to extend GPT the same way across all of its many platforms, so we can bring our plug-ins to more than just Bing, using the same programming models to cross the divide between traditional code and generative AI prompts and semantic memory.