Business Intelligence: December 2013

Saturday, December 28, 2013

Doctors will routinely use your DNA to keep you well

by IBM Research

Watson sifted four terabytes of data to play Jeopardy! Now, it’s sorting even more healthcare data with the likes of WellPoint, Memorial Sloan-Kettering Cancer Center, MD Anderson Cancer Center, and Cleveland Clinic. IBM predicts that over the next five years, similar cognitive systems will help doctors unlock the Big Data of DNA to pinpoint cancer therapy for their patients.

Already, full DNA sequencing is helping some patients fight cancer. For example, Dr. Lukas Wartman famously beat leukemia using treatments that were tailored to the DNA mutations of his cancer cells. While previous leukemia treatments had failed, full genome sequencing of Wartman’s healthy cells and cancer cells revealed that a drug normally used for kidney cancer might work. It did.

This breakthrough has led to tremendous advances in cancer therapy based on DNA mutations, rather than simply the location of the cancer in the body.

But today, Big Data can get in the way of these breakthroughs for patients because doctors must correlate data from full genome sequencing with reams of medical journals, studies and clinical records at a time when medical information is doubling every five years. This process is expensive and time-consuming, and available to too few patients.

IBM is building cognitive systems connected in the cloud to bring these tailored treatment options to more patients around the world. The speed of these insights through cognitive systems could save the lives of cancer patients who have no time to lose.

How to personalize cancer treatment

Even once a doctor has sequenced your full genome as well as your cancer’s DNA, mapping that information to the right treatment is difficult. Today, these types of DNA-based plans, where available, can take weeks or even months. Cognitive systems will shorten these times and widen availability by giving doctors information they can use to build a focused treatment plan in days or even minutes, all via the cloud.

Within five years, deep insights based on DNA sequencing will be accessible to more doctors and patients to help tackle cancer. By using cognitive systems that continuously learn about cancer and the patients who have cancer, the level of care will only improve. Doctors will no longer need to rely on assumptions based on a cancer’s location or type, and the same approach could extend to any disease with a DNA link, such as heart disease and stroke.

Which Of The Five Types Of Data Science Does Your Startup Need?

Credit: O'Reilly

Startups, you are doing data science wrong. That’s the title of a post penned by Ryan Weald in GigaOm this week. Weald echoes DJ Patil’s idea: “product-focused data science is different than the current business intelligence style of data science.”

Weald points to a different model of data scientist: an engineer, not a statistician, who can run queries and, based on the resulting insights, improve the product with a few code changes and a push to git.

I like Weald’s post but disagree on one point. I don’t think there is one type of data scientist, but five.

Quantitative, exploratory data scientists tend to have PhDs and use theory to understand behavior. I count Hal Varian, Chief Economist at Google, and Redpoint’s own Jamie Davidson, among them. Varian’s team researches the advertiser dynamics within the ads auction and compares those dynamics to theoretical auction models like the Vickrey auction. By combining theory and exploratory research, these data scientists improve products.
Operational data scientists often work in the finance, sales or operations teams at Google. In the AdSense ops team where I started, we had a star data analyst who each week would discuss our team’s performance: our email response times, the satisfaction scores of our publishers, and changes in publisher behavior by segment. His work provided a feedback loop to improve the team’s tactics and efficiency. Only infrequently were these insights used to influence product.
Product data scientists tend to belong to product management or engineering. This is the group of data scientists Weald writes about. PMs and engineers sift through logs and analysis tools to understand the way users interact with a product and leverage that knowledge to refine the product. At Google, the ads quality team analyzed user click data to improve ad targeting.
Marketing data scientists segment the user base, evaluate the performance of advertising campaigns, match product features to customer segments, and design content marketing campaigns. The marketing data scientist creates awareness and leads for the sales team, helping generate revenue.
Research data scientists create insights as a product. Nate Silver is arguably the most famous of them. Silver’s work doesn’t influence a product; the analysis is the product itself. Sometimes the data science leads to a thought leadership whitepaper, or a blog post, or a financial report. It’s rarer for startups to employ research scientists because the output isn’t tied to revenue. But larger companies like Google do, think tanks do, financial institutions do.

These five types of data scientists span almost every department of knowledge work. Sometime in the past thirty years, data science became inextricable from the day-to-day operation of these teams. Product, marketing, eng, sales all use data to make decisions. These teams use data to identify, understand and implement feedback loops and to reinforce the behavior a company desires.

To talk about data scientists might be too myopic. Your startup may need a research data scientist or one with a PhD. Or it may need an engineer with an understanding of basic statistics who can work up and down the Rails stack. Or another type altogether.

As with any role, when hiring or recruiting a data scientist it’s important to identify the key problems facing the business and the skills the right candidate will need to solve them.

Friday, December 13, 2013

Are You Recruiting A Data Scientist, Or Unicorn?

Guest blog by Jeff Bertolucci (InformationWeek)

Many companies need to stop looking for a unicorn and start building a data science team, says the CEO of data applications firm Lattice.

The emergence of big data as an insight-generating (and potentially revenue-generating) engine for enterprises has many management teams asking: Do we need an in-house data scientist?

According to Shashi Upadhyay, CEO of Lattice, a big data applications provider, it doesn't make sense for organizations to hire a single data scientist, for a variety of reasons. If your budget can swing it, a data science team is the way to go. If not, data science apps may be the next best thing. "If you look at any industry, the top 10 companies can afford to have data scientists, and they should build data science teams," Upadhyay told InformationWeek in a phone interview.

But the solution is less clear for smaller organizations. "The pattern that I've seen now, having done this for over six years, is that very often medium-sized companies think of the problem as, 'I need to go and get me one data scientist,'" said Upadhyay.

But the shortage of data scientists, a problem that's only expected to worsen in the next few years, makes that approach a risky proposition.

For example, a company may hire one or two people, Upadhyay said, "but before you know it, because the supply for this talent group is so far behind demand, they have lost this person [who] has gone to the next company. And all of a sudden, all that good work is lost. And you ask yourself, 'Why did that happen? And how can I manage against it?'"

One common problem, he noted, is that companies simply don't understand data scientists and how they work. The job generally requires knowledge of a wide array of technical disciplines, including analytics, computer science, modeling, and statistics. "They also tend to be fairly conversant in business issues," Upadhyay added.

But it's often difficult to find these divergent skills in a single human being. "It's a little bit like looking for a unicorn," Upadhyay said.

When medium-sized companies -- those that fall below the top five in a given industry, for instance -- hire just one or two data scientists, they often can't provide a long-term career path for those people within the company. As a result, the data scientists get frustrated and move on to the next thing.

In Silicon Valley, where data scientists command six-figure salaries and are in great demand, it's very difficult to retain talented people.

The better solution? Build a team.

"You will absolutely get a benefit if you hire a data science team," said Upadhyay. "Go all the way [and] commit to creating a career path for them. And if you do it that way, you will get the right kind of talent because people will want to work for you."

Smaller companies that can't afford data science teams should consider big data applications instead. The biggest firms -- in Upadhyay's words, "the Dells, HPs, and Microsofts of the world" -- can take both approaches: data science teams and big data apps.

The team approach seems to be winning. "I rarely see teams that are one or two people in size," Upadhyay observed. "Obviously people have those teams, but they tend to evaporate over time. Until they get to a team of 10 people or more, [companies] can't justify it."

So what does a data science team cost, and what's the payoff?

Upadhyay offered this example: Say you hire a team of 10 data scientists with an average annual cost of $150,000 per employee. "That's $1.5 million for a data science team," he said. "So they better be creating at least $15 million dollars in value for you -- 10 times [the expense] -- to be worth it."
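Upadhyay's back-of-the-envelope arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `required_value` and its parameters are made up for this example, and the 10x multiple is his rule of thumb from the quote above, not a fixed standard.

```python
def required_value(team_size: int, cost_per_head: float, multiple: float = 10.0) -> tuple[float, float]:
    """Return (annual team cost, value the team must create to justify it).

    `multiple` is the payoff ratio; 10.0 mirrors the "10 times [the expense]"
    rule of thumb quoted in the article and is an assumption, not a benchmark.
    """
    team_cost = team_size * cost_per_head
    return team_cost, team_cost * multiple


# The example from the article: 10 data scientists at $150,000 each.
team_cost, target_value = required_value(team_size=10, cost_per_head=150_000)
print(f"Annual team cost: ${team_cost:,.0f}")
print(f"Value needed to be worth it: ${target_value:,.0f}")
```

Changing `team_size` or `cost_per_head` shows how quickly the bar rises, which is part of why the article steers smaller companies toward big data applications instead.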

