The Microhood: Big data, cloud computing and a million gigs a day

By Zoe Goldring

Here at JobsBlog one of our goals is to pull back the cover on jobs at Microsoft and give you a no-holds-barred look into the lives of ‘Softies. We’re also always trying to get more first-person accounts of different jobs and teams at Microsoft. It’s like getting to know all the people in our neighborhood; nay, our Microhood! This week we are very lucky to have Ed Harris tell us about his job as the development manager on Bing Cosmos. What I like most about Ed’s story is that he wrote it himself. This isn’t my personal view of his job or team; it’s his own account, and it gives you a really good overview of what it’s like to be part of this group. Not to mention there’s beer. Yeah, you can almost always get me reading if there’s beer involved.

Cheers! Zoe

 
I’m Ed Harris, a development manager on the Bing team. I’ve been at Bing since 2004, and I’ve worked on a lot of different teams – on the front-end/consumer-facing bits, deep in the platform space, and now in our infrastructure team.

Our team’s mission is to provide the stable foundation that Bing and other online applications run on. Want to install an operating system and deploy a build on 50,000 machines with a couple of keystrokes? That’s us. Need a place to store a petabyte or two of business-critical data? That’s us too. Want to analyze every page of the web? We’ve got what you need.

Our team is called Cosmos. Cosmos is the cloud storage and computing environment that Microsoft’s online properties use for data storage, analytics, and “Big Data” computation. Every day, we load or generate petabytes of new data – and each petabyte is one million gigabytes! In the online business, data is the most precious commodity of all – whether it’s our relevance experiments, our copies of the web, or a thousand other data sets that get curated – all of it is mission critical.

All of that is great, but the thing I enjoy most about being on the infrastructure team is the diversity of what we get to work on. To build an efficient storage engine, you need brilliant system programmers who can make the disks do amazing and unnatural things. On top of that, you have to layer data integrity – at the scale we operate, random bits flip from zero to one or vice versa on a regular basis. So besides being fast, the storage layer needs to constantly scrub itself. We also have folks who are world-class compression experts. To be cost effective we need to squeeze the most data possible onto the disks, but not spend too much CPU doing it. Further up the stack is our execution and computation team. They are able to take a user query, optimize it, and schedule it to run on tens of thousands of PCs. You can write a three-line query in our language (called SCOPE) that actually turns into a map/reduce job across twenty thousand servers.
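To make the scrubbing idea a little more concrete, here is a minimal sketch in Python. It is only an illustration of the general technique (record a checksum when a block is written, then periodically re-check it), not how Cosmos actually does it; the function names `store` and `scrub`, the sample blocks, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Digest recorded alongside a block when it is written."""
    return hashlib.sha256(block).hexdigest()


def store(blocks):
    """Hypothetical 'write path': keep each block next to its checksum."""
    return [(bytearray(b), checksum(b)) for b in blocks]


def scrub(stored):
    """Re-hash every block and return the indices that no longer match."""
    return [
        i
        for i, (data, digest) in enumerate(stored)
        if checksum(bytes(data)) != digest
    ]


# Made-up sample data standing in for blocks on disk.
stored = store([b"relevance experiment", b"web crawl page", b"query log"])

# Simulate a random bit flip in the second block.
stored[1][0][0] ^= 0x01

corrupted = scrub(stored)
print("blocks needing repair:", corrupted)
```

A real storage layer would then repair each flagged block from a healthy replica; the sketch stops at detection.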

With that diversity of problem solving comes an incredible group of people. Our problem space is on the cutting edge of cloud computing, and it has attracted a group of rock-star engineers. It’s an infectiously collaborative environment, and not a day goes by when I don’t have the opportunity to learn a new technique or algorithm.

Though we take our responsibilities as the custodians of Bing’s and Microsoft’s data assets very seriously, we also have a lot of fun! We take time for team events like hiking or flash-mob Frisbee as a way to get to know each other and celebrate our hard work. On a daily basis, it’s fairly common to see team members cutting loose after work – the floor regularly reverberates from aggressive Kinect volleyball and Dance Central contests. Aside from these happenings, we also built our very own infrastructure keg fridge this year.

It’s called the InfraKegerator and we now have three different kinds of beer and home-made root beer on tap. And because we are all geeks, we intend to fully integrate our InfraKegerator into our datacenter automation this fall. We want biometric access control so other teams don’t steal our beer, alerts to fire when the kegs run low, 24/7 temperature monitoring, and minable data on which microbrews are most popular. It’s just one of the ways that we tie together work and personal interests to create a truly amazing workplace.

Working in this kind of environment – with a great set of hard problems to solve, a world-class group of coworkers, and a team that knows how to cut loose – this is what I love most about being at Bing. 

 

Bing's Cosmos team is hiring software development engineers! Click here to view a list of openings with this team and find your place at Microsoft.