Building Cloud Search as a Service with AI

It’s been almost a year since I joined the Azure Search team. A lot has changed since then. I joined right after the team doubled in size by merging with the Text Analytics team, with a mission to add intelligence to search. A few months later the entire Cognitive Services (Azure Machine Learning APIs) platform team joined us. Then we hired additional developers to build a scalable platform for both Cognitive Services and Azure Search. After that we also got a team of data scientists who are building the actual machine learning models. Now, as the Applied AI team, we are at the center of AI and Cloud at Microsoft.

Azure Search is a search-as-a-service cloud solution that gives developers APIs and tools for adding a rich search experience to web and mobile applications. You get things like autocomplete, suggestions, synonyms, results highlighting, facets, filters, sorting and paging for free. Functionality is exposed through a REST API or the .NET SDK. The biggest pain, which is infrastructure and availability, is managed by us.

While having all of that, we also need a great developer experience. Everybody needs to be able to understand how to build that Search AI pipeline without spending hours reading docs. This is another thing we are working on. Email me or tweet at me if you are interested in that kind of stuff.

Where are we going?

We want to build the best Search as a Service platform, one that enables developers to add a Bing-like or Google-like search experience to their websites. No need for hiring search experts who know what an inverted index is. No challenges with shard allocation and how to implement master election properly. No need for distributed systems expertise to scale this to large amounts of data. Last, but not least: no need for setting up, owning and managing the infrastructure. Everything is being taken care of by the platform. By the Cloud.

Our team is also working on market-leading Machine Learning APIs. We are going to utilize these ML models and enable you to search through not only text, but also through your images, audio and videos.

There are a lot of challenges on that journey: from processing large amounts of data, through doing it in a reasonable time (performance/parallelization), to providing an efficient user experience throughout the process.

Where are we now?

We already have a fast, reliable and production-ready system for full-text search. You can provision it in no time, scale it by adding more replicas or partitions, and monitor it using the metrics we provide. You can query it with the .NET SDK or using the REST API. We even have an open source UI generation tool that gets you started with the latter: AzSearch.js.
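To make the REST option a bit more concrete, here is a minimal sketch in Python of how a query request could be put together. The helper `build_search_request` is hypothetical (it is not part of any SDK), and the service name, index name, key and `api-version` value are placeholders I picked for illustration. It only builds the URL and headers; it does not send anything over the network.

```python
from urllib.parse import quote, urlencode


def build_search_request(service, index, query, api_key, top=10,
                         api_version="2017-11-11"):
    """Build the URL and headers for a full-text query against an index.

    All argument values are supplied by the caller; the default
    api_version is an assumption and may need to be updated.
    """
    params = {
        "api-version": api_version,
        "search": query,   # the full-text search expression
        "$top": top,       # maximum number of results to return
    }
    # Keep "$" readable in the query string instead of encoding it as %24.
    query_string = urlencode(params, safe="$")
    url = (
        f"https://{service}.search.windows.net"
        f"/indexes/{quote(index)}/docs?{query_string}"
    )
    headers = {"api-key": api_key, "Accept": "application/json"}
    return url, headers


# Example with made-up names; pass url and headers to any HTTP client.
url, headers = build_search_request(
    "my-service", "hotels", "beach resort", "<query-key>", top=5
)
print(url)
```

The resulting URL and headers can be handed to whichever HTTP client you prefer; the response comes back as JSON.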

To learn more about the current capabilities of Azure Search, check out this awesome presentation by Bryan Soltis:

There are two ways to populate your search index: by simply inserting documents (records) into it, or by using an indexer, a mechanism that enables you to sync your search index with your data source (SQL or NoSQL database, blob storage, etc.).
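The two approaches can be sketched roughly like this in Python. Both helpers are hypothetical and only build the JSON bodies you would send to the REST API (a POST to the index's `docs/index` endpoint for the push approach, and an indexer definition for the pull approach). The document fields, resource names and the two-hour schedule are assumptions made up for the example.

```python
import json

# Actions a document in a push batch can carry.
VALID_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}


def build_upload_batch(documents, action="upload"):
    """Wrap plain documents into the batch body used to push them into an index."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return {"value": [{"@search.action": action, **doc} for doc in documents]}


def build_indexer_definition(name, data_source_name, target_index_name,
                             interval="PT2H"):
    """Describe an indexer that pulls from a data source on a schedule.

    The interval is an ISO 8601 duration; "PT2H" (every two hours) is
    just an illustrative default.
    """
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
        "schedule": {"interval": interval},
    }


# Example with made-up documents and resource names.
batch = build_upload_batch([
    {"id": "1", "title": "Hello search"},
    {"id": "2", "title": "Hello indexer"},
])
indexer = build_indexer_definition("my-indexer", "my-datasource", "my-index")
print(json.dumps(batch, indent=2))
print(json.dumps(indexer, indent=2))
```

Pushing documents gives you full control over when the index changes, while an indexer keeps the index in sync with the source without extra code on your side.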

We have already started adding AI to our search pipeline by enabling you to run text analytics and OCR on your data. If you are using an indexer, you can create a skillset, which can detect people, entities, organizations, locations, key phrases, and language in textual data. On top of that you can use OCR to recognize text in your images, which lets you search through that text. You can also run the text analytics mentioned above on the recognized text. We call this approach Cognitive Search. Here is a quick video by Brian and Corom from our team, with a sneak peek of what’s possible:
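As a rough illustration of what such a skillset looks like, here is a sketch in Python that builds a definition with an OCR skill followed by key phrase extraction on the recognized text. The helper functions and the skillset name are hypothetical, and the skill type names, paths and input/output names reflect my understanding of the skillset JSON shape, so treat them as assumptions and check them against the current REST API reference before relying on them.

```python
import json

# Path where extracted images are assumed to appear in the enriched document.
IMAGES_PATH = "/document/normalized_images/*"


def ocr_skill():
    """OCR skill: recognizes text in each image extracted from a document."""
    return {
        "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
        "context": IMAGES_PATH,
        "inputs": [{"name": "image", "source": IMAGES_PATH}],
        "outputs": [{"name": "text", "targetName": "text"}],
    }


def key_phrase_skill(context, text_source):
    """Key phrase extraction skill that runs over the given text source."""
    return {
        "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
        "context": context,
        "inputs": [{"name": "text", "source": text_source}],
        "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
    }


def build_skillset(name):
    """Chain OCR and key phrase extraction so OCR output feeds text analytics."""
    return {
        "name": name,
        "description": "OCR on images, then key phrases on the recognized text",
        "skills": [
            ocr_skill(),
            # The OCR skill writes "text" under each image, so read it from there.
            key_phrase_skill(IMAGES_PATH, IMAGES_PATH + "/text"),
        ],
    }


# Example with a made-up skillset name.
print(json.dumps(build_skillset("my-skillset"), indent=2))
```

The indexer then references the skillset by name, and the outputs of the skills can be mapped to fields in the search index.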

Last year we created a prototype of Cognitive Search, using the JFK files that went public. You can check out our JFK files website, the GitHub repo, and the video below from the Connect(); conference in 2017, where Corom explains how he built a pipeline to achieve what is now possible by just checking a checkbox:

We announced Cognitive Search at the //build conference earlier this year. Together with the NBA we built a website that allows you to search through players’ photos. You can search for players, their shoes or correlations between them:

A similar approach can be used for a variety of different scenarios: from filtering your family photos, through analyzing medical records, to deciding which cryptocurrency to buy. Now, all those PDFs and Word documents you have on your hard drive can be used to make an informed business decision.

There are a lot of companies using Azure Search in production. It’s super exciting for me that Real Madrid is using Azure Search. It has been my favorite football club since I was a kid.

How’s the team?

My favorite thing about our team is the people. Every single person brings something different to the table, and there is something you can learn from each one of them: from distributed systems expertise, through API design, to building efficient monitoring infrastructure that enables us to maintain a production cloud service.

One of our team members is Henrik Frystyk Nielsen, who is best known for his pioneering work on the World Wide Web and subsequent work on computer network protocols. Currently he works on encapsulating Machine Learning models into containers. Our manager, Pablo Castro, started not only Azure Search, but also the OData protocol and LINQ to Entities. Our Project Manager, Lance Olson, was one of the founders of .NET!

You can check out what people say about our team on Blind! Search for “Azure Search” 😉 There is also a blog post written by Pablo a few years ago: Startup at Microsoft. A lot has changed since then. We went through a few rounds of “funding”, and our team grew. However, we still believe in the core values expressed there. For example: every engineer on the team still talks to customers on a daily basis, either through social media or directly over email or Skype.

BTW: We are hiring!


I am joining Cloud AI team to work on Azure Search

It has been over 3 years since I joined the Azure Portal team. During that time I learned a lot about every aspect of web and mobile development. I delivered over 20 technical talks at different conferences around the world and at local meetups. It was amazing to take the new Portal from preview to v1. In the meantime, during the //oneweek hackathon, together with a few other folks, we built a prototype of the Azure Mobile App. After getting feedback from Scott Guthrie, who said that “it would be super useful”, I started working on the app overnight.

I didn’t know much about mobile development at the time, but I wanted to learn. I didn’t know much about the complexities of Active Directory authentication and Azure Resource Manager APIs. I just knew that it would be super cool to have an app that would allow me to check the status of my Azure resources while waiting for my lunch. Receiving a push notification and being able to scale a VM from my phone would also be tremendously valuable.

When I started working on the app full time, my dream came true. I could truly connect my passion with work. I enjoyed the long hours and late nights we all put in to make it happen. The day when Scott Hanselman presented the Azure App at the //build conference was one of the best days of my life.

Now that the Azure App is released and backed by a great team, I can move on to the next challenge.

Machine learning is becoming part of every aspect of our lives. Over the last few years, ML crossed the threshold necessary to be extremely useful. I always wanted to be part of it. I took a great Coursera class by Andrew Ng, I started an overnight project, StockEstimator, and I got involved in SeeingAI to learn what real-world machine learning looks like.

Now, I’m taking it to the next level. I am joining the Azure Search team to lead their User Experience. I will be responsible for bringing the product to customers. While using my existing web development knowledge, I will have an amazing opportunity to learn more about Big Data, AI and ML.

Azure Search is a managed cloud search service that offers scalable full-text search over multiple languages, geo-spatial search, filtering and faceted navigation, type-ahead queries, hit highlighting, and custom analyzers. You can find more details in this talk by Pablo Castro (Azure Search manager and creator of the Open Data Protocol).

The cool thing about working for Microsoft is that you may end up working with one of the people who created the HTTP protocol. Henrik Frystyk Nielsen, a former student of Tim Berners-Lee who shared an office with Håkon Wium Lie (creator of CSS), joined my new team this month. What’s even cooler, he is sitting next to me 🙂

In my new office with Henrik:

Henrik Frystyk Nielsen and Jacob Jedryszek

If you want to learn more about all the cool stuff we are doing in the Cloud AI group, there is an awesome .NET Rocks podcast with Joseph Sirosh. Check it out!

There is also an awesome talk by Joseph from the last Connect(); conference, which includes the JFK files demo presented by Corom Thompson from my team (creator of How-Old.NET). In that demo Corom showcases how you can use Azure Search and Cognitive Services to explore the JFK files. Super cool! You can see the demo in the video below, and the code on GitHub.

There has never been a better time to work at the intersection of Cloud and Artificial Intelligence!