Retail is Applying AI in Four Main Areas – Data Remains a Challenge

If you’ve been paying attention to the application of artificial intelligence in retail, you may feel like the buzz around the topic has gone from zero to “arrived” in less than a year. In retail time, even at the speed of the modern consumer, that is incredibly fast.

Some of the hype has come from activity around specific use cases for the application of AI in retail. While companies like Baidu claim more than 100 AI capabilities, in retail the use cases appear to be centering on four main areas:

Predictive analytics / forecasting – This is forecasting with an emphasis on either products or customers. For products, retailers appear to be focusing on three main areas of opportunity. First, they are looking at product attributes in a new, AI-driven light: by looking beyond the obvious attribute connections between products, they are using machine learning to identify connections that otherwise get lost in the noise. They are then connecting those attributes to drivers of demand, to make finer-grained predictions of how well products will sell and why. And finally, retailers are looking to incorporate non-traditional demand signals to get a better picture of demand – testing whether new kinds of data reveal connections in consumer behavior that can be exploited in the future. For example, predicting that a restaurant will sell 25% more salads if the lunch-time temperature is above 80 degrees F, or, conversely, that lettuce contamination in the headlines creates a 10% decline in salad sales.
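To make that product-demand idea concrete, here is a minimal sketch (ours, not the retailer’s or Forbes’) of how a non-traditional signal like lunch-time temperature could enter a demand model; the features, numbers and data are invented purely for illustration.

```python
# Illustrative only: a toy demand model treating lunch-time temperature and a
# negative-headline flag as non-traditional demand signals. All values invented.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500
temp_f = rng.uniform(55, 100, n)          # lunch-time temperature (F)
is_weekend = rng.integers(0, 2, n)        # 1 if Saturday or Sunday
neg_headlines = rng.integers(0, 2, n)     # 1 if a contamination story is in the news

# Synthetic "truth": hot days lift salad sales, bad headlines depress them.
salads = (40 + 10 * (temp_f > 80) + 3 * is_weekend
          - 4 * neg_headlines + rng.normal(0, 3, n))

X = np.column_stack([temp_f, is_weekend, neg_headlines])
model = GradientBoostingRegressor(random_state=0).fit(X, salads)

# Expected lift on an 85 F day vs. a 70 F day (weekday, no bad headlines).
hot, mild = model.predict([[85, 0, 0]])[0], model.predict([[70, 0, 0]])[0]
print(f"predicted hot-day lift: {100 * (hot - mild) / mild:.0f}%")
```

In practice a retailer would train on real point-of-sale history and many more signals; the point is simply that weather and headlines become ordinary features in the model.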

Predictive analytics is also being applied to customer behavior. Matching products to customer behavior can be used in the product sense above, but it can also be used in a customer sense: to predict the next product a specific customer would be interested in buying, as well as when, in which channel, and at which price (or with which offer) that customer would be most likely to buy, and which product would most have their attention. This has made its way into retail through personalization solutions, mostly targeted at the digital portion of the customer journey.

Voice / Natural Language Processing In – While the retail industry tends to lump natural language processing (NLP) inputs and outputs together, in reality some applications focus only on inputs, while others focus more heavily on outputs, which are much more difficult and are covered next. On the input side, applications focus on speech-to-text and then text recognition, which can then be analyzed for sentiment or emotion. Examples include call center chats or phone calls where the system detects when a customer might be getting angry, or traditional social media analysis that is smart enough to self-learn – so, instead of a person having to go through and note exceptions to language that is traditionally considered negative (“This vacuum sucks” is sometimes not a bad thing), the AI learns to detect and categorize those exceptions on its own over time.
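As a hedged illustration of the input side (not any vendor’s actual system), a sentiment classifier trained on labeled transcripts can pick up domain exceptions like the vacuum example, because the labels, not a fixed word list, decide what counts as negative. The tiny training set below is made up; a real system would learn from large volumes of call-center and social-media text.

```python
# Illustrative sketch: a sentiment model learns domain exceptions from labeled
# transcripts instead of relying on a fixed list of negative words.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "this vacuum sucks up everything, love it",     # "sucks" used positively
    "the vacuum sucks, returning it tomorrow",
    "agent was helpful and my issue is resolved",
    "still on hold after an hour, unacceptable",
]
labels = ["positive", "negative", "positive", "negative"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["this new vacuum sucks up pet hair really well"]))
```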

Voice / NLP Out – The output side is much harder, because it requires the AI to approximate human behavior well enough to sound “natural”. Chatbots are on the learning curve, as are automated copywriters. Chatbots are a little easier to pull off because you can seed them with a smaller subset of information, and they tend to be focused on specific objectives, like problem solving or sales. Copy is a lot harder because it tends to rely on a broader range of inputs, and human expectations may include more difficult language concepts like metaphor or poetic license. But retailers are looking to these capabilities either to offset human communication and customer service costs, as in a call center, or to generate far more unique product copy far faster – or both.

Read the source article in Forbes.

Here Are 5 Ways Big Data Is Revolutionizing the Agriculture Industry

Big data and analytics are helping to improve and transform a multitude of industries in the modern world. The most impactful thing such technologies do is provide detailed and real-time insights into operational and financial activities. In agriculture, this very thing is playing out as we speak.

Farmers, for instance, are using data to calculate harvest yields, fertilizer demands and cost savings, and even to identify optimization strategies for future crops.

The question is less whether the technology offers benefits — it indeed does — and more about how it delivers them. Here are five ways in which big data in agriculture is improving conditions and operations.

#1: Monitoring Natural Trends

A significant risk factor in farming and agriculture is out of the control of those doing the brunt of the work. Pests and crop diseases, for example, can decimate entire harvests, as can natural disasters like storms or extreme weather. Before big data existed, it was almost impossible to predict such events. Yes, experienced farmers may be able to spot the tell-tale signs of a pest problem — but by then it is often already too late.

Big data and monitoring technologies can track such events and even predict them outright. By feeding past and present data into a system and extracting insights through validated algorithms, data science can effectively boost future yields. This can save farmers and supply chain stakeholders a lot of money overall, and it helps smooth distribution and supply.

Big data also drives the incorporation of modern tech into the field. UAVs, or drones, can be flown over cropland to assess land patterns, and the mapping data collected can then be analyzed for useful intelligence. Perhaps erosion in a particular section of cropland warrants attention this year.

Alternatively, IoT sensors can track and monitor croplands and plants remotely.
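To illustrate the prediction idea in this section, here is a hedged sketch of a model that scores pest-outbreak risk from field readings; the handful of rows, the column names and the threshold behavior are all invented for demonstration.

```python
# A minimal sketch of the idea above: combine past field readings with the
# observed outcome (pest outbreak or not) and score current conditions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

history = pd.DataFrame({
    "soil_moisture": [0.30, 0.45, 0.28, 0.50, 0.33, 0.48],
    "humidity":      [0.60, 0.85, 0.55, 0.90, 0.65, 0.88],
    "temp_c":        [22,   27,   20,   29,   23,   28],
    "pest_outbreak": [0,    1,    0,    1,    0,    1],
})

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(history[["soil_moisture", "humidity", "temp_c"]], history["pest_outbreak"])

# Score today's sensor readings for one plot: probability of an outbreak.
today = pd.DataFrame({"soil_moisture": [0.47], "humidity": [0.86], "temp_c": [28]})
print("outbreak risk:", model.predict_proba(today)[0, 1])
```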

#2: Advanced Supply Tracking

In farming and agriculture today, outside of more traditional scenarios, a farmer is often beholden to a particular supplier or partner. They may, for example, be sending a certain amount of their most recent harvest to a local grocer or a department chain. Regardless of who is partnered with whom, it is not always possible to know precisely how much of a particular crop is going to be ready, and when. This, coupled with changing demand on the consumer side, can lead to severe supply issues.

Big data can alleviate some of the problems that arise in the supply chain, simply because it affords more oversight of the crops and harvest each season. This is true not just of the farmers working with the plants, but of everyone else along the supply chain, too, including distributors, packagers, retailers and more. When passed on, the data can genuinely help everyone prepare for the harvest’s actual progress, whether that means greater or smaller quantities than expected.

#3: Risk Assessment

In general business, management and planning teams often have the benefit of detailed risk assessment reports. Until now, that’s never been possible in the world of agriculture. Sure, experience may dictate that taking a specific action is going to produce apparent consequences, but data-driven risk assessment affords so much more than that.

With big data, nearly every system, decision or event can be considered in the risk analysis plan. Every mistake or potential hurdle can be accounted for, along with not just the appropriate solution, but an expected list of results, too. Farmers can be sure that taking action won’t destroy their entire crop. More importantly, they can use real-time data to ensure damage remains minimal.

#4: Ideal Crops and Consumer Expectation

Let’s say spring and early summer are just over the horizon. Naturally, this is when the strawberry season kicks off — alongside many other crops. Except, over the coming year, the demand for strawberries is vastly lower than in previous seasons.

Rather than filling up an entire plot with strawberries, farmers can account for the lowered demand. This can be true in the opposite direction, too, when demands are higher. Big data enables this at a more advanced level than ever before.

Farmers can see precisely how much they produced in years past, what that meant for customers, how it affected supply and demand, and even get tips for improving their operations. They could cut excess waste by producing fewer crops in a lower-demand season, for instance, saving both money and space to grow alternatives.

#5: Data-Driven Industry

Another benefit of big data is that these systems can be synced with external platforms for a considerable amount of additional data and insights. It ties into the whole “connected”, smart side of technology.

Machine learning and algorithmic tools can be designed to factor in any number of external insights or information. Farmers can then use predictive modeling techniques to plan or act accordingly — think weather patterns, consumer demands and trends and even historical industry events. This data will help those in the agriculture industry to understand how the surrounding world affects their business.

What should they plant? When is the best time? What earnings can they expect? Are the prices of supplies rising, and how does this affect profits?

This all works to create a collaborative, data-driven industry that operates in new, innovative ways rather than simply repeating the strategies of the past. The beauty of this is that we don’t have to eliminate legacy strategies to make room for data-driven solutions. In fact, we can combine them to create some of the most effective, successful operations the industry has seen.

Read the source article in RT Insights.

Executive Interview: Dr. Russell Greiner, Professor CS and founding Scientific Director of the Alberta Machine Intelligence Institute

After earning a PhD from Stanford, Russ Greiner worked in both academic and industrial research before settling at the University of Alberta, where he is now a Professor in Computing Science and the founding Scientific Director of the Alberta Innovates Centre for Machine Learning (now Alberta Machine Intelligence Institute), which won the ASTech Award for “Outstanding Leadership in Technology” in 2006. He has been Program Chair for the 2004 “Int’l Conf. on Machine Learning”, Conference Chair for 2006 “Int’l Conf. on Machine Learning”, Editor-in-Chief for “Computational Intelligence”, and is serving on the editorial boards of a number of other journals. He was elected a Fellow of the AAAI (Association for the Advancement of Artificial Intelligence) in 2007, and was awarded a McCalla Professorship in 2005-06 and a Killam Annual Professorship in 2007. He has published over 200 refereed papers and patents, most in the areas of machine learning and knowledge representation, including 4 that have been awarded Best Paper prizes. The main foci of his current work are (1) bioinformatics and medical informatics; (2) learning and using effective probabilistic models and (3) formal foundations of learnability. He recently spoke with AI Trends.

Dr. Russell Greiner, Professor in Computing Science and founding Scientific Director of the Alberta Machine Intelligence Institute

Q: Who do you collaborate with in your work?
I work with many very talented medical researchers and clinicians, on projects that range from psychiatric disorders, to stroke diagnosis, to diabetes management, to transplantation, to oncology, everything from breast cancer to brain tumors. And others — I get many cold-calls from yet other researchers who have heard about this “Artificial Intelligence” field, and want to explore whether this technology can help them on their task.

Q: How do you see AI playing a role in the fields of oncology, metabolic disease, and neuroscience?

There’s a lot of excitement right now for machine learning (a subfield of Artificial Intelligence) in general, and especially in medicine, largely due to its many recent successes.  These wins are partly because we now have large data sets, including lots of patients — in some cases, thousands, or even millions of individuals, each described using clinical features, and perhaps genomics and metabolomics data, or even neurological information and imaging data. As these are historical patients, we know which of these patients did well with a specific treatment and which ones did not.  

I’m very interested in applying supervised machine learning techniques to find patterns in such datasets, to produce models that can make accurate predictions about future patients. This is very general — this approach can produce models that can be used to diagnose, or screen novel subjects, or to identify the best treatment — across a wide range of diseases.

It’s important to contrast this approach with other ways to analyze such data sets. The field of biostatistics includes many interesting techniques to find “biomarkers” — single features that are correlated with the outcomes — as a way to try to understand the etiology, trying to find the causes of the disease. This is very interesting, very relevant, very useful. But it does not directly lead to models that can decide how to treat Mr. Smith when he comes in with his particular symptoms.  

At a high level: I’m exploring ways to find personalized treatments — identifying the treatment that is best for each individual. These treatment decisions are based on evidence-based models, as they are learned from historical cases — that is, where there is evidence that the model will work effectively.

In more detail, our team has found patterns in neurological imaging, such as functional MRI scans, to determine who has a psychiatric disorder — here, for ADHD, or autism, or schizophrenia, or depression, or Alzheimer’s disease.

Another body of work has modeled how brain tumors will grow, based on standard structural MRI scans. Other projects learn screening models that determine which people have adenoma (from urine metabolites), models that predict which liver patients will most benefit from a liver transplant (from clinical features), or which cancer patients will develop cachexia, etc.

Q: How can machine learning be useful in the field of Metabolomics?

Machine learning can be very useful here. Metabolomics has relied on technologies like mass spec and NMR spectroscopy to identify and quantify small molecules in a biofluid (like blood or urine); this previously was done in a very labor-intensive way, by skilled spectroscopists.

My collaborator, Dr. Dave Wishart (here at the University of Alberta), and some of our students have designed tools to automate this process — tools that can effectively find the molecules present in, say, blood. This means metabolic profiling is now high-throughput and automated, making it relatively easy to produce datasets that include the metabolic profiles of a set of patients, along with their outcomes. Machine learning tools can then use such a labeled dataset to produce models for predicting who has a disease, for screening or for diagnosis. This has led to models that can detect cachexia (muscle wasting) and adenoma (with a local company, MTI).
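As a rough sketch of that workflow (not the Wishart lab’s actual pipeline), metabolite concentrations plus a known outcome label can train a screening model; everything below, including the two “informative” metabolites, is synthetic.

```python
# Hedged sketch: metabolite concentrations from automated profiling, plus a
# known outcome, feed a supervised learner that can screen new patients.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_patients, n_metabolites = 200, 30
X = rng.lognormal(mean=0.0, sigma=1.0, size=(n_patients, n_metabolites))
# Pretend two metabolites are genuinely associated with the condition.
y = (0.8 * X[:, 0] - 0.6 * X[:, 1] + rng.normal(0, 1, n_patients) > 0.5).astype(int)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print("cross-validated AUC:", cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())

clf.fit(X, y)
new_profile = rng.lognormal(size=(1, n_metabolites))   # a new patient's biofluid profile
print("predicted probability of disease:", clf.predict_proba(new_profile)[0, 1])
```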

Q: Can you go in to some detail on the work you have done designing algorithms to predict patient-specific survival times?

This is my current passion; I’m very excited about it.

The challenge is building models that can predict the time until an event will happen — for example, given a description of a patient with some specific disease, predict the time until his death (that is, how long he will live). This seems very similar to the task of regression, which also tries to predict a real value for each instance — for example, predicting the price of a house based on its location, the number of rooms and their sizes, etc. Or, given a description of a kidney patient (age, height, BMI, urine metabolic profile, etc.), predict the glomerular filtration rate of that patient a day later.

Survival prediction looks very similar because both try to predict a number for each instance. For example, I describe a patient by his age, gender, height, and weight, and his genetic information, and metabolic information, and now I want to predict how long until his death — which is a real number.  

The survival analysis task is more challenging due to “censoring”. To explain, consider a 5-year study that began in 1990. Over those five years, many patients passed away, including some who lived for three years, others for 2.7 years, or 4.9 years. But many patients didn’t pass away during those 5 years — which is a good thing… I’m delighted these people haven’t died! But it makes the analysis much harder: for the many patients alive at the end of the study, we know only that they lived at least 5 years, but we don’t know whether they lived 5 years and a day, or 30 years — we don’t know and never will.

This makes the problem completely different from standard regression tasks. The tools that work for predicting glomerular filtration rate or for predicting the price of a house just don’t apply here; you have to find other techniques. Fortunately, the field of survival analysis provides many relevant tools. Some tools predict something called “risk”, which assigns a number to each patient, with the understanding that patients with higher risks are predicted to die before those with lower risks. So if Mr A’s risk for cancer is 7.2 and Mr B’s is 6.3 — that is, Mr A has a higher risk — this model predicts that Mr A will die of cancer before Mr B will. But does this mean that Mr A will die 3 days before Mr B, or 10 years? The risk score doesn’t say.
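To show what censoring and risk look like in practice, here is a small sketch using the open-source lifelines library (our assumption; the interview does not name specific tools). Each subject carries a follow-up time and an event flag, and the fitted Cox model returns exactly the kind of relative risk score described above. All numbers are invented.

```python
# Censored survival data and Cox-model risk scores, sketched with lifelines.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.DataFrame({
    "age":    [52, 60, 47, 71, 66, 58],
    "smoker": [1, 1, 0, 1, 0, 0],
    "time":   [2.7, 3.0, 5.0, 1.2, 5.0, 4.9],   # years of follow-up
    "event":  [1, 1, 0, 1, 0, 1],               # 0 = still alive at study end (censored)
})

cph = CoxPHFitter(penalizer=0.1)                 # small penalty helps on tiny data
cph.fit(df, duration_col="time", event_col="event")

# Partial hazards act as risk scores: a higher value means "predicted to die
# sooner", but it says nothing about how much sooner.
print(cph.predict_partial_hazard(df[["age", "smoker"]]))
```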

Let me give a slightly different way to use this. Recall that Mr A’s risk of dying of cancer is 7.2.  There are many websites that can do “what if” analysis: perhaps if he stops smoking, his risk reduces to 5.1.  This is better, but by how much? Will this add 2 more months to his life, or 20 years? Is this change worth the challenge of not smoking?

Other survival analysis tools predict probabilities — perhaps Ms C’s chance of 5-year disease-free survival is currently 65%, but if she changes her diet in a certain way, this chance goes up to 78%. Of course, she wants to increase her five-year survival. But again, this is not as tangible as learning, “If I continue my current lifestyle then this tool predicts I will develop cancer in 12 years, but if I stop smoking, it goes from 12 to 30 years”. I think this is much more tangible, and hence will be more effective in motivating people to change their lifestyle, than a change in their risk score or their 5-year survival probability.

So my team and I have provided a tool that does exactly that, by giving each person his or her individualized survival curve, which shows that person’s expected time to event. I think that will help motivate people to change their lifestyle. In addition, my colleagues and I have also applied this to a liver transplant dataset, to produce a model that can determine which patients with end-stage liver failure will benefit the most from a new liver, and so should be added to the waitlist.
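Continuing the hedged lifelines sketch above (again, our illustration, not Dr. Greiner’s own tools), the same kind of fitted model can return each person’s own survival curve, and a what-if comparison (the same patient as a non-smoker) becomes a difference between two curves rather than two abstract risk scores. All numbers are invented.

```python
# Individualized survival curves and a "what if" lifestyle comparison.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.DataFrame({
    "age":    [52, 60, 47, 71, 66, 58],
    "smoker": [1, 1, 0, 1, 0, 0],
    "time":   [2.7, 3.0, 5.0, 1.2, 5.0, 4.9],   # years of follow-up
    "event":  [1, 1, 0, 1, 0, 1],               # 0 = censored
})
cph = CoxPHFitter(penalizer=0.1).fit(df, duration_col="time", event_col="event")

patient = pd.DataFrame({"age": [58], "smoker": [1]})
what_if = patient.assign(smoker=0)               # hypothetical lifestyle change

# Each column is that individual's own curve S(t); the time at which it crosses
# 0.5 is a predicted median survival time, a tangible number to act on.
horizons = [1, 2, 3, 4, 5]
print(cph.predict_survival_function(patient, times=horizons))
print(cph.predict_survival_function(what_if, times=horizons))
```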

Those examples all deal with time to death, but in general, survival analysis can deal with time to any event. So it can be used to model a patient’s expected time to readmission. Here, we can seek a model that, given a description of a patient being discharged from a hospital, can predict when that patient will be readmitted — eg, whether she will return to the hospital, for the same problem, soon or not.

Imagine this tool predicted that, given Ms Jones’ current status, if she leaves the hospital today, she will return within a week. But if we keep her one more day and give some specific medications, we then predict her readmission time is 3 years. Here, it’s probably better to keep her that one more day and give the extra medication. It will help the patient, and will also reduce costs.

Q: What do you see are the challenges ahead for the healthcare space in adopting machine learning and AI?

There are two questions: what machine learning can do effectively, and what it should do.

The second involves a wide range of topics, including social, political, and legal issues. Can any diagnostician — human or machine — be perfect? If not, what are the tradeoffs? How do we verify the quality of a computer’s predictions? If it makes a mistake, who is accountable? The learning system? Its designer? The data on which it was trained? Under what conditions should a learned system be accepted, and eventually incorporated into the standard of care? Does the program need to be “convincing”, in the sense of being able to explain its reasoning — that is, explain why it asked for some specific bit of information, or why it reached a particular conclusion? While I do think about these topics, I am not an expert here.

My interest is more in figuring out what these systems can do — how accurate and comprehensive can they be? This requires getting bigger data sets — which is happening as we speak. And defining the tasks precisely — is the goal to produce a treatment policy that works in Alberta, or one that works for any patient, anywhere in the world? This helps determine the diversity of training data that is required, as well as the number of instances. (Hint: building an Alberta-only model is much easier than a universal one.) A related issue is defining exactly what the learned tool should do. In general, the learned performance system will return a “label” for each patient — which might be a diagnosis (eg, does the patient have ADHD), or a specific treatment (eg, give an SSRI [that is, a selective serotonin reuptake inhibitor]). Many clinicians assume the goal is a tool that does what they do. That would be great if there were an objective answer and the doctor were perfect, but this is rarely the case. First, in many situations there is significant disagreement between clinicians (eg, some doctors may think that a specific patient has ADHD, while others may disagree) — if so, which clinician should the tool attempt to emulate? It would be better if the label instead were some objective outcome — such as “3-year disease-free survival”, or “progression within 1 year” (where there is an objective measure for “progression”, etc.).

This can get more complicated when the label is the best treatment — for example, given a description of the patient, determine whether that patient should get drug-A or drug-B. (That is, the task is prognostic, not diagnostic.)  While it is relatively easy to ask the clinician what she would do, for each patient, recall that clinicians may have different treatment preferences… and those preferences might not lead to the best outcome. This is why we advocate, instead, first defining what “best” means, by having a well-defined objective score for evaluating a patient’s status, post treatment.  We then define the goal of the learned performance system as finding the treatment, for each patient, that optimizes that score.

One issue here is articulating this difference, between “doing what I do” versus optimizing an objective function.  A follow-up challenge is determining this objective scoring function, as it may involve trading off, say, treatment efficacy with side-effects, etc. Fortunately, clinicians are very smart, and typically get it!  We are making in-roads.

Of course, after understanding and defining this objective scoring function, there are other challenges — including collecting data from a sufficient number of patients and possibly controls, from the appropriate distributions, then building a model from that data, and validating it, perhaps on another dataset. Fortunately, there are an increasing number of available datasets, covering a wide variety of diseases, with subjects (cases and controls) described with many different types of features (clinical, omics, imaging, etc.). Finally comes the standard machine learning challenge of producing a model from that labeled data. Here, too, the future is bright: there are faster machines and, more importantly, I have many brilliant colleagues developing ingenious new algorithms to deal with many different types of information.

All told, this is a great time to be in this important field!  I’m excited to be a part of it.

Thank you Dr. Greiner!

Learn more at the Alberta Machine Intelligence Institute.

Pre-built Analytic Modules Will Drive AI Revolution in Industry

By Bill Schmarzo, CTO, Big Data Practice of EMC Global Services

What is the Intelligence Revolution equivalent to the 1/4” bolt?

I asked this question in the blog “How History Can Prepare Us for Upcoming AI Revolution?” when trying to understand what history can teach us about technology-induced revolutions. One of the key capabilities of the Industrial and Information revolutions was the transition from labor-intensive, hand-crafted solutions to mass-manufactured ones. In the Information Revolution, it was the creation of standardized database management systems, middleware and operating systems. For the Industrial Revolution, it was the creation of standardized parts – like the ¼” bolt – that could be used to assemble, rather than hand-craft, solutions. So, what is the ¼” bolt equivalent for the AI Revolution? I think the answer is Analytic engines or modules!

Analytic Modules are pre-built engines – think Lego blocks – that can be assembled to create specific business and operational applications. These Analytic Modules would have the following characteristics (a minimal sketch of such a module in code follows the list):

  • pre-defined data input definitions and data dictionary (so it knows what type of data it is ingesting, regardless of the origin of the source system).
  • pre-defined data integration and transformation algorithms to cleanse, align and normalize the data.
  • pre-defined data enrichment algorithms to create higher-order metrics (e.g., reach, frequency, recency, indices, scores) necessitated by the analytic model.
  • algorithmic models (built using advanced analytics such as predictive analytics, machine learning or deep learning) that take the transformed and enriched data, run the model and generate the desired outputs.
  • a layer of abstraction (maybe using the Predictive Model Markup Language, or PMML) above the predictive analytics, machine learning and deep learning frameworks that allows application developers to pick their preferred or company-mandated standards.
  • an orchestration capability to “call” the most appropriate machine learning or deep learning framework based upon the type of problem being addressed. See Keras, a high-level neural networks API, written in Python and capable of running on top of popular machine learning frameworks such as TensorFlow, CNTK, or Theano.
  • pre-defined outputs (APIs) that feed the analytic results to downstream operational systems (e.g., operational dashboards, manufacturing, procurement, marketing, sales, support, services, finance).

In short, Analytic Modules produce pre-defined analytic results or outcomes, while providing a layer of abstraction that enables the orchestration and optimization of the underlying machine learning and deep learning frameworks.
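One way the module contract described in this list could look in code is sketched below; the class, the method names and the toy predictive-maintenance example are illustrative only, not part of any product or of Schmarzo’s own tooling.

```python
# A minimal "Analytic Module" contract: fixed input schema, pre-defined
# transform/enrich steps, an algorithmic model, and a pre-defined output.
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]

@dataclass
class AnalyticModule:
    """A pre-built, pluggable analytic engine: fixed inputs, fixed outputs."""
    input_schema: List[str]                    # pre-defined data dictionary
    transform: Callable[[Record], Record]      # cleanse / align / normalize
    enrich: Callable[[Record], Record]         # derive indices, scores, recency
    model: Callable[[Record], float]           # the algorithmic model itself

    def run(self, raw: Record) -> Dict[str, float]:
        assert all(k in raw for k in self.input_schema), "input does not match schema"
        features = self.enrich(self.transform(raw))
        return {"score": self.model(features)}  # pre-defined output for downstream APIs

# Toy predictive-maintenance module: score rises with vibration and temperature.
pm_module = AnalyticModule(
    input_schema=["vibration", "temp_c"],
    transform=lambda r: {k: float(v) for k, v in r.items()},
    enrich=lambda r: {**r, "stress_index": r["vibration"] * r["temp_c"]},
    model=lambda r: min(1.0, r["stress_index"] / 1000.0),
)
print(pm_module.run({"vibration": 7.5, "temp_c": 95}))
```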

Monetizing IoT with Analytic Modules

The BCG Insights report titled “Winning in IoT: It’s All About the Business Processes” highlighted the top 10 IoT use cases that will drive IoT spending, including predictive maintenance, self-optimized production, automated inventory management, fleet management and distributed generation and storage (see Figure 1).

But these IoT applications will be more than just reports and dashboards that monitor what is happening. They’ll be “intelligent” – learning with every interaction to predict what’s likely to happen and prescribe corrective action to prevent costly, undesirable and/or dangerous situations – and the foundation for an organization’s self-monitoring, self-diagnosing, self-correcting and self-learning IoT environment.

While this is a very attractive list of IoT applications to target, treating any of these use cases as a single application is a huge mistake. It’s like the return of the big bang IT projects of ERP, MRP and CRM days, where tens of millions of dollars are spent in hopes that two to three years later, something of value materializes.

Instead, these IoT “intelligent” applications will be composed of analytic modules integrated to address the key business and operational decisions those applications need to support. For example, think of predictive maintenance as an assembly of analytic modules addressing the following predictive maintenance decisions:

  • Identifying at-risk components and predicting failures.
  • Optimizing resource scheduling and staffing.
  • Matching technicians and inventory to the maintenance and repair work to be done.
  • Ensuring tools and repair equipment availability.
  • Ensuring first-time-fix optimization.
  • Optimizing parts and MRO inventory.
  • Predicting component fixability.
  • Optimizing the logistics of parts, tools and technicians.
  • Leveraging cohort analysis to improve service and repair predictability.
  • Leveraging event association analysis to determine how weather, economic and special events impact device and machine maintenance and repair needs.

As I covered in the blog “The Future Is Intelligent Apps,” the only way to create intelligent applications is to have a methodical approach that starts the predictive maintenance hypothesis development process with the identification, validation, valuing and prioritizing of the decisions (or use cases) that comprise these intelligent applications.

Read the source article in Data Science Central.

Data Science on a Budget: Audubon’s Advanced Analytics

On Memorial Day weekend 2038, when your grandchildren visit the California coast, will they be able to spot a black bird with a long orange beak called the Black Oystercatcher? Or will that bird be long gone? Will your grandchildren only be able to see that bird in a picture in a book or on a website?

A couple of data scientists at the National Audubon Society have been examining the question of how climate change will impact where birds live in the future, and the Black Oystercatcher has been identified as a “priority” bird — one whose range is likely to be impacted by climate change.

How did Audubon determine this? It’s a classic data science problem.

First, consider birdwatching itself, which is pretty much good old-fashioned data collection. Hobbyists go out into the field, identify birds by species and gender and sometimes age, and record their observations on their bird lists or bird books, and more recently on their smartphone apps.

Audubon itself has sponsored an annual crowdsourced data collection event for more than a century — the Audubon Christmas Bird Count — providing the organization with an enormous dataset of bird species and their populations in geographies across the country at specific points in time. The count is 118 years old and constitutes one of the longest-running bird data sets in the world.

That’s one of the data sets Audubon used in its project looking at the impact of climate change on bird species’ geographical ranges, according to Chad Wilsey, director of conservation science at Audubon, who spoke with InformationWeek in an interview. Wilsey is an ecologist, not trained as a data scientist, but like many scientists he uses data science as part of his work. In this case, as part of a team of two ecologists, he applied statistical modeling, using technologies such as R, to multiple data sets to create predictive models of future geographical ranges for specific bird species. The results are published in the 2014 report, Audubon’s Birds and Climate Change. Audubon also published interactive ArcGIS maps of species and ranges to its website.

The initial report used Audubon’s Christmas bird count data set and the North American Breeding Bird Survey from the US government. The report assessed geographic range shifts through the end of the century for 588 North American bird species during both the summer and winter seasons under a range of future climate change scenarios. Wilsey’s team built models based on climatic variables such as historical monthly temperature and precipitation averages and totals. The team built models using boosted regression trees and machine learning. These models were built with bird observations and climate data from 2000 to 2009 and then evaluated with data from 1980 to 1999.
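Audubon’s team works in R; purely to illustrate the modeling approach described here, the Python sketch below relates climate variables to species presence with gradient-boosted trees, trains on one period, evaluates on another, and then shifts the temperature input to mimic a warming scenario. All of the data, column choices and the +2 C scenario are synthetic stand-ins, not Audubon’s figures.

```python
# Illustrative species-range model: boosted trees on climate variables,
# trained on one decade of observations and evaluated against another.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)

def fake_survey(n):
    # stand-in for Christmas Bird Count / Breeding Bird Survey records
    temp = rng.normal(10, 5, n)        # mean monthly temperature (C)
    precip = rng.gamma(2.0, 30.0, n)   # monthly precipitation (mm)
    present = (temp > 8) & (precip > 40) & (rng.random(n) > 0.2)
    return np.column_stack([temp, precip]), present.astype(int)

X_train, y_train = fake_survey(2000)   # "2000-2009" observations
X_eval, y_eval = fake_survey(1000)     # "1980-1999" hold-back for evaluation

model = GradientBoostingClassifier().fit(X_train, y_train)
print("hold-back AUC:", roc_auc_score(y_eval, model.predict_proba(X_eval)[:, 1]))

# Project the range under a warmer scenario by shifting the temperature input.
X_future = X_eval.copy()
X_future[:, 0] += 2.0                  # +2 C scenario (illustrative)
print("projected occupancy rate:", model.predict_proba(X_future)[:, 1].mean())
```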

“We write all our own scripts,” Wilsey told me. “We work in R. It is all machine learning algorithms to build these statistical models. We were using very traditional data science models.”

Audubon did all this work on an on-premises server with 16 CPUs and 128 gigabytes of RAM.

Read the source article in InformationWeek.com.

5 Tips to Turn Your Data to Your Competitive Advantage

The need for competitive advantage sees companies increasingly turning to analytics to operationalize their data. Leveraging analytics from insight to artificial intelligence (AI), business leaders can make sense of their rapidly-growing piles of data to improve their operations. Here are my tips for using analytics to create a measurable business impact.

#1: Put the decision before the data

With a decision-first strategy you define the business objective first, then determine what data and analytics you need to achieve the goal. Extrapolating insights from huge amounts of data can be interesting, but it can also be a tremendous waste of time and resources if it doesn’t solve a specific business challenge. If the modeling and data analytics requirements are defined by the business outcome first, data exploration and analytic development are faster and more productive. This helps enterprises home in on meaningful outcomes, shutting out extraneous noise and focusing on the insights that address specific objectives.

#2: Get data into decision makers’ hands

Empower business leaders with the ability to evaluate the complete spectrum of potential opportunities. This requires a combination of insight, advanced analytics and prescriptive decisioning to explore, simulate and pressure-test scenarios in real time. To do this, you need user-friendly decision management tools that can be rapidly configured and can evolve with the specific needs of the operation. Experience has shown that when business experts have access to the data, insight and tools to exploit analytics, they can visualize relationships between different variables and actions, and quickly identify the preferred outcomes for maximum impact.

#3 AI & machine learning can expand your frontiers

Every decision that is made or action that is taken provides an opportunity to improve. The key is automatically feeding those learnings back into the analytic system to influence the next decision or action. By using decision management tools that incorporate machine learning and artificial intelligence, enterprises can conduct complex analysis that evolves and improves as new scenarios are added. With artificial intelligence and machine learning, you can discover unique insights and meaningful patterns in large volumes of data. Then, add self-learning models that will allow you to adapt quickly to changes in those patterns or take action on those insights. But, to unlock the full business potential, the analytic output must be explainable to a business expert if it is to be understood and accepted.

#4: Keep it open and focus on integration

One of the easiest ways to start a healthy debate about analytics is to pose the question of which tool is the best! The reality is that it depends on what you are trying to accomplish, but even more so who is accomplishing it. One thing is for sure: you most certainly are not starting from scratch and already have technology systems in place. Be certain any further investment is in analytic and decision management tools that are open and can easily integrate with your existing environment. However, the key requirement is to understand how you will eventually use and manage those analytics within your day-to-day operation.

#5 Operationalize the analytics

The real value of analytics comes when they are operationalized. Connecting the data and insights gleaned from advanced analytics to day-to-day operations will tie to positive business outcomes. With prescriptive analytics, you can add business rules or optimization models to the analytics – which will trigger a specific action to be taken in different scenarios based on a deep understanding of the situation, predictions about the future, and other business constraints or regulations.

With these suggestions in mind, business leaders can help move their enterprise to a place where artificial intelligence and human intelligence come together to drive real business outcomes that deliver competitive advantage and better differentiation.

Read the source article in RTInsights.com.

Startup Using Artificial Intelligence To Guide Earthquake Response

A startup company in California is using machine learning and artificial intelligence to advise fire departments about how to plan for earthquakes and respond to them. (Photo above shows first responders in the Marina District disaster zone after an earthquake on October 17, 1989, in San Francisco, Calif.)

The company, One Concern, hopes its algorithms can take a lot of the guesswork out of the planning process for disaster response by making accurate predictions about earthquake damage. It’s one of a handful of companies rolling out artificial intelligence and machine learning systems that could help predict and respond to floods, cyber-attacks and other large-scale disasters.

Nicole Hu, One Concern’s chief technology officer, says the key is to feed the computers three main categories of data.

The first is data about homes and other buildings, such as what materials they’re made of, when they were built and how likely they are to collapse when the ground starts shaking.

The next category is data about the natural environment. For example, “What is the soil like? What is the elevation like? What is the general humidity like?” explains Hu.

“The third thing we look at is live instant data,” she says, such as the magnitude of the quake, the traffic in the area of the quake and the weather at the time of the quake.

The computer uses the information to make predictions about what would happen if an earthquake occurred in a particular area. It then uses data from past earthquakes to see whether its predictions are any good, and revises its predictive models accordingly.

In other words, it learns as it goes, which is basically how machine learning works.
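As a hedged sketch of that predict-then-revise loop (not One Concern’s proprietary system), the example below combines the three feature categories Hu describes and updates the model incrementally as ground-truth damage reports arrive; every feature name and number is invented.

```python
# Toy predict-then-revise loop: building, environment and live-event features
# feed a regressor that is updated as observed damage comes in.
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor(random_state=0)

def features(building, environment, live):
    # building: [year_built_z, soft_story]; environment: [soil_score, elevation_z];
    # live: [magnitude_z, distance_z]. All six values are invented, standardized inputs.
    return np.concatenate([building, environment, live]).reshape(1, -1)

# Historical quakes provide (features, observed damage) pairs for initial training.
rng = np.random.default_rng(3)
for _ in range(200):
    x = rng.normal(size=(1, 6))
    damage = 0.5 * x[0, 4] - 0.3 * x[0, 0] + rng.normal(0, 0.1)   # synthetic truth
    model.partial_fit(x, [damage])

# During a new event: predict block-level damage, then learn from what is observed.
x_new = features([-0.8, 1.0], [0.6, -0.2], [1.5, 0.3])
print("predicted damage index:", model.predict(x_new)[0])
model.partial_fit(x_new, [1.1])   # revise the model once inspections report back
```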

Stanford University earthquake engineer Gregory Deierlein consulted for One Concern. He says one of the most remarkable things about the company’s software is its ability to incorporate data from an earthquake as it’s happening, and to adjust its predictions in real time.

“Those sort of things used to be research projects,” says Deierlein. “After an event, we would collect data and a few years later we’d produce new models.”

Now the new models appear in a matter of minutes.

He notes the company’s exact methods are opaque. “Like many startup companies they’re not fully transparent in everything they’re doing,” he says. “I mean, that’s their proprietary knowledge that they’re bringing to it.”

Nonetheless, some first responders are already convinced the software will be useful.

Fire chief Dan Ghiorso leads the Woodside Fire Protection District near San Francisco, which covers about 32 square miles. The San Andreas fault is only a couple hundred feet behind the firehouse.

Ghiorso says in the past, when an earthquake hit, he’d have to make educated guesses about what parts of his district might have suffered the most damage, and then drive to each place to make a visual inspection. He hopes One Concern’s software will change that, although he has yet to put it to the test during an actual quake.

“Instead of driving thirty-two square miles, in fifteen minutes on a computer I can get a good idea of the concerns,” he says. “Instead of me, taking my educated guess, they’re putting science behind it, so I’m very confident.”

Unfortunately, it’s going to take a natural disaster to see if his confidence is justified.

Read the source article at KAMU, the educational broadcast services department of Texas A&M University.

Industrial IoT Analytics Moving Into Prime Time

Implementing an Internet of Things (IoT) program isn’t exactly like flipping a switch. There’s a lot involved, from the sensors where the data is initially collected, to the network the data travels over, to the analytics systems that figure out what it all means. So while we’ve all been talking about IoT for a few years now, it’s still considered an emerging technology. But that might be about to change.

Forrester Research has predicted that 2018 is the year that IoT will move from experimentation to business scale.

Heeding the call, a few analytics vendors are getting on the bandwagon with formal divisions or product offerings. For instance, in January, SAS announced a new IoT division. And this week Splunk announced its first technology specifically for the IoT market, Splunk Industrial Asset Intelligence. The solution is designed to help organizations in manufacturing, oil and gas, transportation, energy and utilities monitor and analyze industrial IoT data in real time, creating a simple view of complex industrial systems while helping to minimize asset downtime, according to the company’s formal announcement.

As a specialist in machine data analytics, Splunk found IoT a natural extension. IDC Analyst and Program VP Maureen Fleming told InformationWeek in an interview that it is also something Splunk’s customers have been requesting; those customers were already trying to solve some of their IoT challenges using Splunk’s existing offerings. The existing customer need, along with Splunk’s expertise in machine data analytics, converged to drive Splunk’s IoT launch now, Fleming said.

Wind, Power, and Data

One of those customers is Australia’s Infigen Energy, which develops, owns, and operates wind farms to power businesses in the country. Information and Application Architect Victor Sanchez told InformationWeek that Splunk IAI has made it easier to troubleshoot issues with Infigen’s legacy automated SCADA control system.

The company first deployed Splunk Enterprise in a pilot in 2014, and since then the system has evolved from simple monitoring to a platform for ingesting data from all the company’s turbines and other equipment. Now Infigen is starting to build its first machine learning models and correlating more data from different levels of the business, from technical to operational, Sanchez said.

Infigen’s implementation uses a translation box deploying Kepware with Splunk’s Industrial Data forwarder to enable the ingestion of industrial data into Splunk. Sanchez said the company is expanding the size of its on-premises distributed Splunk cluster to future-proof the system and prevent bottlenecks.

The system has provided better visibility of this key data to all employees. The company also uses the mobile Splunk app to enable alerts and data access on the go.

Seema Haji, director of product marketing for IoT and business analytics, joined Splunk about 10 months ago to help the company launch the technology. She told InformationWeek in an interview that many of Splunk’s existing customers have been looking for a way to work with their IoT data; many had been using Excel to bring data in and analyze it. Splunk has set these customers up with a limited-availability version of the new IoT analytics offering.

Read the source article in InformationWeek.com.

Southwest Airlines Chose an Analytics Project with Big Impact

Everyone wants their analytics investment to yield successful business results, but not everyone can show the kind of serious return on investment that captures the attention of C-suite executives. So when Southwest Airlines did a pilot program to look at the potential impact of investing in an analytics package, Doug Gray, director of analytical data services at the company, knew he needed to choose something with a big impact.

Of course, analytics work was nothing new to Southwest Airlines. The company had been working with analytics projects for about 20 years already. Such projects are absolutely necessary to manage the complexity of airline operations efficiently and cost effectively.

But with that complexity, there’s quite a lot of room to find more cost savings and efficiencies.

Airlines need their flights to depart and arrive on time. They need to manage a fleet of aircraft and have those aircraft in the right places at the right times for their flights. They need to forecast customer demand for flights so that they operate that fleet efficiently and profitably. And they need to manage crew scheduling and know where their crew members are.

In the case of Southwest Airlines, that means managing and forecasting for 700 aircraft flying about 4,000 flights per day to more than 100 domestic and international destinations.

Gray said Southwest Airlines already uses over 100 terabytes of data and dozens of applications that involve analytics, and when it looks to create a new application, it goes for high-value targets — the project has to yield millions of dollars in business value.

So for its 2016 pilot to test a new analytics platform and toolset, Gray’s group chose the company’s second-largest expense — fuel costs. Gray recounted the story of the project at the Gartner Data and Analytics Summit in March.

Depending on market pricing at any given time, Southwest Airlines spends between $4 billion and $6 billion a year on fuel. That means any small percentage improvement in those costs amounts to a huge number.

Before the pilot began, Southwest Airlines’ fuel cost forecasting efforts relied on pulling information from multiple systems including Ariba, the Allegro fuel management environment, and then the company’s own enterprise data warehouse for historical data. Plus, Gray admitted, the company also had a lot of spreadsheets that the team used to store and manipulate quite a lot of data.

“We were feeding all that data into one big massive spreadsheet,” Gray said. With 100 airports served and running a rolling schedule for forecasting fuel every month of the year, the team was producing 1,200 fuel demand forecasts every month. It took one of the finance analysts three days each month to go through the process of generating forecasts that generally weren’t as accurate as the company would have preferred. So there was a lot of room for improvement.

The fuel consumption pilot project used Alteryx Designer, the platform’s gallery, and R to build eight different predictive models, including time series regression models and neural networks, Gray said. Across every month and airport, the system generated 9,600 forecasts. Gray couldn’t share the actual cost savings in dollar amounts publicly, but he said they were substantial. Benefits included reducing the amount of time required for data wrangling by 60%. Forecast accuracy improved. And Southwest learned it could negotiate a better deal on fuel if it gave all the business to a single vendor in Southern California rather than buying from multiple vendors there. The change also sped up the process of fuel purchasing.
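Southwest’s models were built in Alteryx and R; purely as a simplified illustration, the sketch below shows one of the simpler ideas, a seasonal time-series regression producing a rolling 12-month fuel forecast for a single airport. The data and the trend/seasonality shape are invented.

```python
# Illustrative fuel-demand forecast: linear trend plus sine/cosine seasonality.
import numpy as np
from sklearn.linear_model import LinearRegression

months = np.arange(48)                              # four years of monthly history
seasonal = 10 * np.sin(2 * np.pi * months / 12)     # summer travel peak
gallons = 500 + 2.0 * months + seasonal + np.random.default_rng(5).normal(0, 5, 48)

def design(m):
    # Features: trend plus month-of-year seasonality encoded as sine/cosine.
    return np.column_stack([m, np.sin(2 * np.pi * m / 12), np.cos(2 * np.pi * m / 12)])

model = LinearRegression().fit(design(months), gallons)

# Rolling 12-month forecast for this airport; repeated per airport (and per
# model) this yields the grid of forecasts the article describes.
future = np.arange(48, 60)
print(model.predict(design(future)).round(1))
```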

Read the source article at InformationWeek.com.

Predictions of Innovations and Trends for Embedded Analytics for 2018

The embedded analytics market is growing at a rapid rate and will be worth $51.78 billion by 2022. Over the next few years, most organizations will begin to move from traditional techniques for analyzing business data to more advanced techniques using embedded analytics. If they don’t, they risk falling behind their competition.

As soon as this year, with the aid of innovations in embedded analytics and business intelligence (BI), data analytics will become much more accessible to professionals with various backgrounds and specialties across an organization. Analytics and business intelligence will no longer be the province of analysts alone, and professionals across organizations will be empowered to make data-driven decisions with embedded analytics tools that are easier to use than current ones. In fact, Gartner predicts that 80% of organizations will work to increase data literacy across their workforces over the next few years as innovations in embedded analytics and embedded BI continue to make for a more user-friendly experience.

Below are the innovations and trends for embedded analytics you’ll want to watch for in 2018, organized by structure and capabilities, and also by features.

Structure and Capabilities

In 2018, embedded analytics will take on a more decentralized structure with robust integration capabilities, more options in the cloud, and adaptive security. It will also be easier for users of various backgrounds across an organization to use.

Various Hybrid Cloud Options

As the amount of data that organizations use continues to grow at alarming rates, businesses will consider moving some archivable data out of the cloud and into on-premises databases for backup and security purposes. Platform capabilities for accessing, integrating, transforming, and loading data into a self-contained performance engine with the ability to index, manage, and refresh data loads (self-contained ETL and data storage) will be extremely important. In addition, many organizations will opt for multi-cloud strategies where they have data in multiple cloud locations to help with data security, costs, and performance. Overall, in 2018, there will be a variety of either multi-cloud or on-premises hybrid options (or a mixture of both).

Decentralized Analytics’ Evolution to Governed Data Discovery

As embedded analytics becomes easier for users across an organization to understand, analytics will become decentralized and data discovery will increase, but it will need to be better governed in 2018. Self-service analytics platforms allow for decentralized workflows and let users prepare their own data analyses with much less reliance on the IT department. As decentralized analytics becomes more prevalent across an organization, the risk of multiple sources of truth and compromised integrity of data insights will become a serious concern. So successful organizations will have to evolve their decentralized analytics toward an approach that embraces governed data discovery over time. Governed data discovery will permit individuals across an organization to analyze data for their own use cases, but will require that data analyses and insights still fit within the governed parameters of the organization’s business procedures, policies and objectives.

Read the source article at Innovation Enterprise.