```python
TOPIC_ID = 53
row = topic_df.loc[topic_df.topic_id == TOPIC_ID]
row[["topic_id", "short_title", "mentions"]]
```
|    | topic_id | short_title      | mentions |
|----|----------|------------------|----------|
| 81 | 53       | AI Model Scaling | 81.0     |
The topic search API (along with our other semantic APIs) produces high-level insights. To dive deeper into these insights and verify them, we provide a mechanism to retrieve the underlying data with our query API. The query API shares a unified filtering engine with our higher-level semantic APIs, so any semantic rollup or insight aggregation can be instantly “unrolled” into the underlying excerpts.

Let’s take the topic AI Model Scaling. The topic metadata above shows that it was mentioned 81 times in the corpus.
We can call the `index.query` API, passing in our `topic_id`. We get back 81 mentions, lining up exactly with our aggregate APIs. Below we display the first and last result of our search, and highlight a few terms to make the excerpts easier to read.
You will notice that each excerpt is accompanied by a set of tags. These are the same tags returned by our `topicSearch` API; here, each tag corresponds to the percentage of the paragraph that it comprises.
```python
df = index.query(topic_id=TOPIC_ID, max_excerpts_per_doc=200, limit=200)  # 200 is the single-request limit
displayText(df.iloc[[0, -1]], highlight=["ai", "generation", "train", "data", "center", "scale"])
assert len(df) == row.iloc[0].mentions  # should match the topic's aggregate mention count
```
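A side note on the `limit` argument: 200 rows is the single-request cap, so a topic with more mentions than that has to be fetched in pages. The loop below is a minimal sketch of that pattern; the `offset` parameter is assumed for illustration and is not part of the documented call shown above.

```python
import pandas as pd

def query_all_mentions(index, topic_id, page_size=200):
    """Page through index.query until every mention of a topic is collected.

    NOTE: `offset` is a hypothetical pagination parameter used only to
    illustrate the pattern; check the query API docs for the real mechanism.
    """
    pages = []
    offset = 0
    while True:
        page = index.query(
            topic_id=topic_id,
            max_excerpts_per_doc=page_size,
            limit=page_size,
            offset=offset,  # assumed parameter, not shown in the call above
        )
        if page.empty:
            break
        pages.append(page)
        if len(page) < page_size:  # short page means we've reached the end
            break
        offset += page_size
    return pd.concat(pages, ignore_index=True) if pages else pd.DataFrame()
```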
Satya Nadella: Thanks for the question. First, yes, the OpenAI partnership is a very critical partnership for us. Perhaps, it’s sort of important to call out that we built the supercomputing capability inside of Azure, which is highly differentiated, the way computing the network, in particular, come together in order to support these large-scale training of these platform models or foundation models has been very critical. That’s what’s driven, in fact, the progress OpenAI has been making. And of course, we then productized it as part of Azure OpenAI services. And that’s what you’re seeing both being used by our own first-party applications, whether it’s the GitHub Copilot or Design even inside match. And then, of course, the third parties like Mattel. And so, we’re very excited about that. We have a lot sort of more sort of talk about when it comes to GitHub universe. I think you’ll see more advances on the GitHub Copilot, which is off to a fantastic start. But overall, this is an area of huge investment. The AI comment clearly has arrived. And it’s going to be part of every product, whether it’s, in fact, you mentioned Power Platform, because that’s another area we are innovating in terms of corporate all of these AI models.
…
Susan Li: Brian, I’m happy to take your second question about custom silicon. So first of all, we expect that we are continuing to purchase third-party silicon from leading providers in the industry. And we are certainly committed to those long-standing partnerships, but we’re also very invested in developing our own custom silicon for unique workloads, where off-the-shelf silicon isn’t necessarily optimal and specifically, because we’re able to optimize the full stack to achieve greater compute efficiency and performance per cost and power because our workloads might require a different mix of memory versus network, bandwidth versus compute and so we can optimize that really to the specific needs of our different types of workloads. Right now, the in-house MTIA program is focused on supporting our core ranking and recommendation inference workloads. We started adopting MTIA in the first half of 2024 for core ranking and recommendations inference. We’ll continue ramping adoption for those workloads over the course of 2025 as we use it for both incremental capacity and to replace some GPU-based servers when they reach the end of their useful lives. Next year, we’re hoping to expand MTIA to support some of our core AI training workloads and over time, some of our Gen AI use cases.
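To make the per-excerpt tags described above easier to scan, we can also tabulate them across the full result set. The sketch below assumes each row of `df` carries a `tags` column holding a dict that maps a tag name to the fraction of the paragraph it comprises; that structure is our reading of the description above, not a documented schema.

```python
from collections import Counter

# Assumed structure: df["tags"] is a dict per excerpt mapping
# tag name -> fraction of the paragraph the tag comprises.
tag_counts = Counter()    # number of excerpts carrying each tag
tag_coverage = Counter()  # summed paragraph fraction per tag

for tags in df["tags"]:
    for tag, fraction in tags.items():
        tag_counts[tag] += 1
        tag_coverage[tag] += fraction

for tag, n in tag_counts.most_common(10):
    print(f"{tag}: {n} excerpts, mean coverage {tag_coverage[tag] / n:.0%}")
```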