Showing posts with label Reliability. Show all posts

Wednesday, June 21, 2023

IBM Claims Major Breakthrough in Quantum Computing Reliability

Claims have been many.

IBM Claims Major Breakthrough in Quantum Computing Reliability, in ExtremeTech

Quantum computers are fast, but they produce a lot of wrong answers. IBM says its error mitigation technique could fix that.

By Ryan Whitwam,  June 15, 2023

Today's most powerful supercomputers can simulate complex weather patterns and the birth of stars, but even a modest quantum computer could outperform those machines. The mysterious nature of quantum mechanics has thus far made quantum computing little more than a curiosity, but IBM is claiming a significant breakthrough. According to IBM's Jay Gambetta, we've reached "the era of utility."

Quantum computers are a hot topic for research, but they haven't been useful for doing calculations yet. We keep trying because of the incredible potential of quantum computing, which takes advantage of weird quantum properties like entanglement, interference, and superposition to accelerate calculations. For example, the digital computers we've used for decades use transistors to signify a 1 or a 0. Using superposition, a quantum bit (qubit) can be a 1, 0, or both. Holding multiple values allows a qubit to perform multiple calculations, whereas a classical computer must do them individually.

These systems are incredibly fast, true, but they don't produce consistent results—at least until now, claims the study published in Nature. Google claimed quantum supremacy in 2019, producing calculations it says would have taken thousands of years on a digital computer. However, a follow-up analysis has shown that a conventional computer can do the same tabulations if given a little more time. IBM's claim is not about speed as much as it's about reliability. For a quantum computer to be useful, it needs to give the same answer every time, and IBM took a big step in that direction with a demonstration of error mitigation.

The team created a simulation of 127-atom bar magnets with a 127-qubit computer, a system known as an Ising model that is often used to study magnetism. At that scale, the magnets are affected by quantum factors, making it impossible to simulate on a classical computer accurately. The researchers leveraged quantum interference to nudge the results, pushing them farther from the solution. By introducing noise into the calculations, the IBM Quantum team could understand the effects of noise in the simulation, working backward to reach the ideal, low-noise result ....'
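The "working backward" idea in the excerpt above is often done by extrapolation: measure the same quantity at several deliberately amplified noise levels, fit a curve, and read off the value at zero noise. Here is a minimal sketch of that idea in Python. The linear fit, the function name `extrapolate_to_zero_noise`, and the readings are all illustrative assumptions with synthetic numbers, not IBM's actual procedure or data.

```python
def extrapolate_to_zero_noise(noise_scales, measured_values):
    """Fit a least-squares line through (scale, value) points and
    return its intercept, i.e. the predicted value at noise scale 0."""
    n = len(noise_scales)
    if n < 2 or n != len(measured_values):
        raise ValueError("need at least two matching (scale, value) pairs")
    mean_x = sum(noise_scales) / n
    mean_y = sum(measured_values) / n
    sxx = sum((x - mean_x) ** 2 for x in noise_scales)
    if sxx == 0:
        raise ValueError("noise scales must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(noise_scales, measured_values))
    slope = sxy / sxx
    return mean_y - slope * mean_x


# Hypothetical readings: scale 1.0 stands for the hardware's native noise,
# larger scales stand for runs where noise was deliberately amplified.
scales = [1.0, 1.5, 2.0, 3.0]
readings = [0.80, 0.70, 0.60, 0.40]  # synthetic, generated as 1.0 - 0.2 * scale

zero_noise_estimate = extrapolate_to_zero_noise(scales, readings)
print(round(zero_noise_estimate, 6))
```

Real devices rarely degrade in a perfectly straight line, so a practical version would have to choose the fit model with care. The sketch only shows why adding noise on purpose can help recover a cleaner answer.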

Monday, March 28, 2022

Auditing Data for Studies

Interesting kind of reliability analysis study.

'Auditing' tool can improve reliability of studies that explore relationships between things

By EKATERINA PESHEVA March 2, 2022 Research

Harvard Medical School News & Research, Ekaterina Pesheva, March 2, 2022

Harvard Medical School (HMS) scientists created the vibration of effects (VoE) auditing tool to improve the reliability of studies that explore the relationships between things. "At its most basic, the vibration of effects model analyzes how the modeling choices a researcher makes can influence what they will discover," explained former HMS researcher Braden Tierney. The tool applies brute-force computation to test the reliability of research findings, and researchers can use it to vet their own results before submission for publication. The tool was used to analyze connections between various gut microbes and six diseases in 15 published studies. High-scoring studies were found to be less reliable because their VoE results exhibited significant variation when run through multiple testing models, while low-scoring studies were found to be more reliable because they identified associations that remained consistent even when processed by different models. ... 
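To make the "modeling choices" point concrete, here is a small Python sketch of the brute-force idea: refit the same association under every combination of adjustment variables and see how much the estimate moves. It is not the HMS tool. The helper names, the ordinary least squares model, and the tiny synthetic dataset (microbe abundance, age, diet) are all assumptions for illustration.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def exposure_coefficient(y, x, adjusters):
    """Ordinary least squares of y on [1, x, *adjusters]; return the coefficient on x."""
    columns = [[1.0] * len(y), x] + adjusters
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in columns]
    return solve_linear_system(xtx, xty)[1]


def vibration_of_effects(y, x, covariates):
    """Refit the model once per subset of adjustment covariates."""
    names = sorted(covariates)
    results = {}
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            adjusters = [covariates[name] for name in subset]
            results[subset] = exposure_coefficient(y, x, adjusters)
    return results


# Synthetic data: the outcome is built as age - 0.5 * microbe, so the true
# microbe effect is negative, but microbe abundance rises with age.
age = [1, 2, 3, 4, 5, 6, 7, 8]
microbe = [2, 1, 3, 2, 4, 3, 5, 4]
diet = [0, 1, 0, 1, 1, 0, 1, 0]
outcome = [a - 0.5 * m for a, m in zip(age, microbe)]

estimates = vibration_of_effects(outcome, microbe, {"age": age, "diet": diet})
for subset, beta in estimates.items():
    print(subset or "(unadjusted)", round(beta, 3))
```

In this toy case the unadjusted estimate is positive and the age-adjusted one is negative, which is the kind of instability the excerpt describes. A finding whose estimates stay close together across all the subsets would count as the more trustworthy one.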

Thursday, October 01, 2020

Outages and their Cause and Consideration

Been a student of internet outages since our enterprise management asked me: Yes, it's great, but what do we do when it goes down? All computers and networks do. I had just got back into Outlook when this came up:

An Outlook email outage affected web mail, desktop, and mobile   By Tom Warren

Microsoft’s Outlook service was down worldwide today, affecting Outlook on the web, Outlook.com, and Outlook on desktop and mobile. The outage started at around 2AM ET, and Microsoft has confirmed it was affecting users worldwide. Outlook users were unable to access their email, and Outlook.com was failing to load for around four hours.

Microsoft says a recent configuration change was to blame for the issues. “We’ve determined that a recent configuration update to components that route user requests was the cause of impact,” says Microsoft in a service message on Twitter. “We’ve reverted the update and are monitoring the service for recovery.”  ... " 

Wednesday, October 16, 2019

Learning, Reliability in Complex Systems

Useful piece in InfoQ .... Learning is hard because we need to continually put it into context for ourselves and our goals. Plus there are the ultimate implications for reliability after we have learned something new.

Jason Hand explores the challenges with learning in complex systems, the relationship between high and low stakes learning opportunities as well as the cost associated.  In InfoQ

Bio: Jason Hand is Senior Cloud Advocate at Microsoft. He writes, presents, and coaches on the principles & nuance of DevOps, Site Reliability Engineering, and modern incident management practices. Named “DevOps Evangelist of the Year” in 2016, he recently authored a book on the topic of Site Reliability Engineering. He is a co-host on the podcast “Community Pulse”- a show on building community in tech

Hand: One of the things I want to talk about today, or the main theme for today is that I hope that I challenge some of the thinking that you've maybe been resting in for a long time in how we build, and operate, and maintain our systems. A lot of the ideas I want to share today came from the travels that I've been doing over the past six or seven months. Back in September, I joined Microsoft, previously I was at a company called VictorOps, which, if you're not familiar with them, they do incident management, on-call management, similar to PagerDuty if you're familiar with that service. I joined Microsoft, and one of the very first things that I was pulled into was what they call Microsoft's Ignite The Tour, which is a global tour. All around the world, there are 17 different cities.

I was pulled into this project, which was actually very interesting, very eye-opening. Over the course of the past 6 months or so, I've traveled almost 92,000 miles, which ended up being a little bit over 3 and a half times around the world, which I never would have thought I would have done in my career. It blows my mind. I've spent a total of over seven days, just in the air alone during that time.

These stats aren't that important, but what is important is that everywhere I go and everybody I talk to, I find that I'm using language like complex systems, and there's not a common understanding of what that actually means. I think - I'm guilty of this - we've fallen into this trap of using words, using terminology that not everyone's on the same page of what it actually means. When I say complex systems, I think sometimes people just accept that as face value and don't actually dig into what does that actually mean when he says complex.

That's a big part of what I want to talk about today, and especially what I want to start with is what do we mean when we're talking about complex systems? Not only that but why - when we're trying to focus on learning so much, we've heard about how it's important to be a learning organization - why are we struggling to actually learn? If you were here for Ryan's [Kitchens] talk previously, he touched on a lot of things that I'm going to also try to amplify a little bit more. It's very difficult for us to really find good methods to learn about our systems, to learn new ways to improve them, and continuously build them and make them better for the world, for everybody that we're trying to serve and trying to make better.  .....


Monday, September 30, 2019

Robotic Reliability vs Reasoning with Transparency

We came up with some similar conclusions: observed reliability was typically much more important than people thought, and it was best to solve that problem before aiming at advanced reasoning in a robot or robotic process. This further explores the value of transparency in an embedded reasoning process, especially in human-robot teams. All essential in a future of such cooperation.

When It Comes to Robots, Reliability May Matter More Than Reasoning
U.S. Army Research Laboratory    September 25, 2019

A study by U.S. Army Research Laboratory (ARL) and University of Central Florida found that human confidence in robots decreases after a robot makes a mistake, even when it is transparent with its reasoning process. The researchers explored human-agent teaming to define how the transparency of the agents, such as robots, unmanned vehicles, or software agents, impacts human trust, task performance, workload, and agent perception. Subjects observing a robot making a mistake downgraded its reliability, even when it did not make any subsequent mistakes. Boosting agent transparency improved participants' trust in the robot, but only when the robot was collecting or filtering data. ARL's Julia Wright said, "Understanding how the robot's behavior influences their human teammates is crucial to the development of effective human-robot teams, as well as the design of interfaces and communication methods between team members."

Monday, June 03, 2019

Seeking Better Alerting

Seems an obvious thing, but the use of alerts has come up in recent interactions as being key to getting things done effectively.  How, How often and Followups.  Interaction with Risk and Trust of the system and its resource components.  Mature Alerts?   Here specifically about site reliability, but broadly useful.  O'Reilly does a good overview of the topic:

Reduce Toil through better Alerting

How SREs can use a hierarchy for mature alerts.
By Štěpán Davidovič, Betsy Beyer  

Check out "The Site Reliability Workbook" for real-world examples of how to put SRE principles and practices to work in your environment.

SRE best practices at Google advocate for building alerts based upon meaningful service-level objectives (SLOs) and service-level indicators (SLIs). In addition to an SRE book chapter, other site reliability engineers at Google have written on the topic of alerting philosophy. However, the nuances of how to structure well-reasoned alerting are varied and contentious. For example, traditional "wisdom" argues that cause-based alerts are bad, while symptom-based or SLO-based alerts are good.

Navigating the dichotomy of symptom-based and cause-based alerting adds undue toil to the process of writing alerts: rather than focusing on writing a meaningful alert that addresses a need for running the system, the dichotomy brings anxiety around deciding whether an alert condition falls on the “correct” side of this dichotomy.

Instead, consider approaching alerting as a hierarchy of the alerts available to you: reactive, symptom-based alerts—typically based on your SLOs—form the foundation of this hierarchy. As systems mature and achieve higher availability targets, other types of alerts can add to your system's overall reliability without adding excessive toil. Using this approach, you can identify value in different types of alerts, while aiming for a comprehensive alerting setup. .... 

As detailed below, by analyzing their existing alerts and organizing them according to a hierarchy, then iterating as appropriate, service owners can improve the reliability of their systems and reduce the toil and overhead associated with traditional cause-based and investigative alerts.  .... "
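As a rough illustration of the foundation layer described above, a symptom-based alert tied to an SLO can be written as a check on how fast the error budget is being spent. The Python sketch below is an assumption-laden toy, not the article's method: the function names, the request counts, and the paging threshold of 14.4 are illustrative choices.

```python
def burn_rate(bad_events, total_events, slo_target):
    """Ratio of the observed error rate to the error rate the SLO allows.

    A value of 1.0 means the error budget is being spent exactly on schedule.
    """
    error_budget = 1.0 - slo_target
    if error_budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    if total_events == 0:
        return 0.0
    return (bad_events / total_events) / error_budget


def should_page(bad_events, total_events, slo_target, threshold):
    """Page a human only when the budget is burning faster than the threshold."""
    return burn_rate(bad_events, total_events, slo_target) >= threshold


# Hypothetical one-hour window for a service with a 99.9% availability SLO.
SLO_TARGET = 0.999
PAGE_THRESHOLD = 14.4  # illustrative value, tune per service

quiet_hour = should_page(500, 1_000_000, SLO_TARGET, PAGE_THRESHOLD)
bad_hour = should_page(20_000, 1_000_000, SLO_TARGET, PAGE_THRESHOLD)
print(quiet_hour, bad_hour)
```

The point of the design is that the alert fires on what users experience (failed requests against the objective) rather than on any particular internal cause. Other alert types from the hierarchy would be layered on top of this check.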

Sunday, February 04, 2018

Integrated Work System

A conversation with former enterprise colleagues led me to this piece on reliability innovation I had a small part in.    See the tag links to Los Alamos Labs, which was also involved.  Here to pass this on to my readers for reference, worth examining.

Pointer to some of the above work in R&D Magazine.
This was originally offered for use by KPMG Consulting in 2005.

EY and P&G Alliance  and see also @EY_Alliances
What will operational excellence look like in your organization’s future?

Despite investments in lean manufacturing, Six Sigma and total productive maintenance, companies are struggling to achieve a breakthrough in manufacturing productivity. EY and P&G have combined their manufacturing excellence capabilities to bring a significantly different approach to attaining higher levels of manufacturing performance.

P&G’s Integrated Work System (IWS) is a proprietary way of improving manufacturing reliability, reducing costs and elevating productivity. IWS is a disruptive way of working predicated on two primary principles: the drive to zero losses and 100 percent employee ownership. The EY and P&G alliance combines the Integrated Work System with the global EY manufacturing performance improvement experience and P&G certified consultants to help clients improve performance via sustainable change and transformation.  ..... 

Newsroom
  
Procter & Gamble (P&G) has recognized EY as one of twelve top performing companies to receive its highest honor of being named Excellence Award winners and External Business Partner of the Year at the 2016 P&G External Business Partner of the Year Recognition Dinner, which recognizes exemplary partner collaboration.

Among P&G’s more than 50,000 suppliers and agencies, EY was the only professional services organization to receive the award. EY has now received the P&G Excellence Award six times and the Partner of the Year award three times.

Kristina Rogers, EY Global Consumer Products and Retail Leader, says:

“We are thrilled to be recognized by P&G for EY’s commitment to collaboration. With the industry facing unprecedented disruption, business innovation is critical to meet changing consumer wants and needs and to sustaining profitable growth. We are proud to be a key member of the P&G innovation ecosystem, challenging thinking and helping to deliver on its consumer promise.”

David Taylor, P&G President and CEO, presented the award to EY for engendering strong trust and collaboration through agility and quality advisory services.

Taylor says: “Supplier partners have a key role to play in our ability to deliver. The more integrated and connected P&G and our supplier partners are, the better able we are to be more innovative and productive in meeting consumers’ needs.”   .... "

Wednesday, February 03, 2016

AI Monitoring Machinery Health

In the CACM:   A favorite topic in manufacturing,  to predict and improve maintenance.  

" ... An artificial intelligence algorithm created by University of Alabama in Huntsville (UAH) principal research scientist Rodrigo Teixeira greatly increases accuracy in diagnosing the health of complex mechanical systems.

"The ability to extract dependable and actionable information from the vibration of machines will allow businesses to keep their assets running for longer while spending far less in maintenance. Also, the investment to get there will be just software," says Teixeira, who is the technical lead for the Health and Usage Monitoring Systems (HUMS) analytics project at UAH's Reliability and Failure Analysis Laboratory (RFAL).

In blind tests using data coming from highly unpredictable and real-life situations, the algorithm consistently achieves over 90 percent accuracy, Teixeira says. ... " 

Wednesday, November 25, 2015

Intel Corp on Device Reliance

Correspondent Jon Stine on devices and more. This comes to mind as I am about to switch phones and add a new communications channel.

" ... During a separate seminar at that same technology summit, Jon Stine -- Intel's global director of sales and strategy for retail, hospitality and consumer goods -- outlined why retailers are spending so much to perfect their mobile and multichannel shopping strategies.

Americans are spending 9.9 hours per day on screens (mobile phone, TV, personal computer and tablet, among others). Intel data reveal that 60 percent to 65 percent of U.S. consumers begin their shopping by going online.

Still, as much as 90 percent of purchases are made in stores. That means serving customers online and in stores is a must for retailers like Wal-Mart, Stine said.

"Being agile and flexible has never been more important for retailers than right now," he added, noting that the mobile phone has become "the remote control for your daily life."

As we become accustomed to websites that load and refresh in 20 milliseconds, saving time has become as high a priority as saving money.

Brown noted that through its app, and the ability for customers to order and pay without ever interacting with a person, Starbucks has shaved a full minute off the time it usually takes to get in and out with a cup of coffee. That adds up to about 5 million minutes a month being saved.

More and more consumers are demanding that the in-store shopping experience be as fast as it is online. We're looking for what Stine described as the "living, breathing Internet." ... " 

Saturday, November 14, 2015

Fail at Scale

In CACM: Reliability and the science of graceful failure.  Abstract, full article requires registration.

Monday, August 10, 2015

GE Predix Cloud Predicts Machine Failures

We worked on a similar project in collaboration with Los Alamos Labs. In particular to do reliability and failure prediction for systems that were composed of elements that only rarely failed.   So called 'Black Swans'.    This was eventually licensed out via a third party.  Examining the difference between the ideas.   Will report back here with more information.  A related project looked at out of stock condition on a store shelf as a failure that could be predicted by multiple sensory inputs.

Pointer to some of the above work in R&D Magazine.
This was eventually offered for use by KPMG Consulting in 2005.

In FastCompany:
GE wants to give industrial machines their own social network with Predix Cloud ... GE is selling a new service that promises to predict when a machine will break down, so technicians can preemptively fix it. .... "

Saturday, April 25, 2015

New System Modeler Capabilities: Reliability

Wolfram's new SystemModeler version brings new capabilities, examining.  I like the mention of 'cyber-physical' systems.    Most anything these days can be considered a member of that set.  And has performance, reliability and operational considerations.

Champaign, IL--With today's release of SystemModeler 4.1, Wolfram is unveiling the next iteration of features in an easy-to-use modeling and simulation environment for cyber-physical systems. ...

One highlight is the integration of Mathematica's complete suite for reliability analysis. Consumer electronics, satellite systems, and flight systems all have different reasons for valuing reliability. Reliability analysis can show where to concentrate engineering efforts to produce the most reliable products, estimate where failure will happen, and price warranties accordingly.

In all, SystemModeler 4.1 adds or improves over 24 applications, alongside many features. Highlights include:

* SystemModeler capabilities for exporting models extended to importing models
* Import from tools such as Simulink, Flowmaster, and IBM Rational Rhapsody enabled based on the FMI standard
* Import subsystems, e.g. control system models from other tools, and integrate them with your Modelica models ...

Industries that rely on Wolfram's SystemModeler include but are not limited to aerospace, automotive, pharmaceuticals, systems biology, and electrical engineering.  ....  "

Monday, April 20, 2015

Business Reliability with System Modeler

I see that Wolfram is starting to put out their system modeling software. Here nicely described for reliability problems. A number of times we used this basic concept for calibrating some business process models (BPM). Should be easier to do with this software.

Reliability Analysis New in Wolfram SystemModeler 4.1 »
Mathematica's complete suite for reliability analysis, with functionality for reliability block diagrams, fault trees, and importance measures, can now be used with SystemModeler. Additional updates include FMI-standardized model import and model development. .... "
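For readers new to the terms, a reliability block diagram reduces to two rules: blocks in series all have to work, and blocks in parallel need only one survivor. The Python sketch below applies those rules to a made-up system and computes a Birnbaum importance measure for each part. It is not SystemModeler or Mathematica code. The component names, the reliability figures, and the assumption of independent failures are all hypothetical.

```python
from math import prod


def series(*reliabilities):
    """Every block must work, so the reliabilities multiply."""
    return prod(reliabilities)


def parallel(*reliabilities):
    """At least one block must work (independent failures assumed)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def system_reliability(r):
    """Hypothetical system: one controller in series with two redundant pumps."""
    return series(r["controller"], parallel(r["pump_a"], r["pump_b"]))


def birnbaum_importance(r, name):
    """System reliability with the component perfect minus with it failed."""
    working = system_reliability({**r, name: 1.0})
    failed = system_reliability({**r, name: 0.0})
    return working - failed


# Made-up component reliabilities over some fixed mission time.
components = {"controller": 0.99, "pump_a": 0.90, "pump_b": 0.90}

overall = system_reliability(components)
ranking = {name: birnbaum_importance(components, name) for name in components}
print(round(overall, 4))
for name, score in sorted(ranking.items(), key=lambda item: -item[1]):
    print(name, round(score, 4))
```

The importance scores show where engineering effort pays off: the single controller dominates because nothing backs it up, while each pump matters much less because its twin covers for it.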

Wednesday, August 06, 2014

Techsolve's Miniviz

Local company Techsolve's software Miniviz helps companies understand if their machine tools are operating efficiently.   Enables small and medium sized companies to make better decisions about maintenance and reliability.  We worked with them on a number of analytics projects. More details in the Cincinnati Enquirer

Thursday, May 29, 2014

Intel in Your Autonomous Car

Intel chips for automobiles. A big market, so the reach is obvious. Specifically the mention of autonomous is interesting. What are the specific needs of such a chip? Speed and reliability? Embedded intelligence and connectivity?

 " ... Intel threw its research might behind the idea of autonomous cars, or automobiles that are safer and more efficient because they can take over the task of driving from humans.

Intel, the world’s biggest chip maker, will provide an “application ready platform” with its own processors and operating system for self-driving cars. The initiative is part of Intel’s larger campaign to provide the intelligence and connectivity for the internet of things, or smarter everyday devices. The effort will include Intel’s own research as well as investments in other companies from a $100 million car technology fund. .... " 

Wednesday, March 12, 2014

Defining Datability

A term new to me:   " ... Datability is the synthesis of big data and security, accountability and reliability.  ... "

And it is the focus topic of CeBIT 2014 (in Hannover, Germany, March 10 to 14, 2014).

Today we have petabytes of data sent through networks that is filtered, analyzed and stored. The challenge is to make sense out of that data, to make it human readable and interpretable, helping humans to make the correct decisions at the right time. This is big data and analytics, and it is only one side of the coin. The other side of the coin is about security, accountability and reliability. When billions of messages, photos, documents and files are sent every day, who cares about security, availability, integrity and confidentiality? Who is accountable in the case of leakages, unauthorized access, modification, deletion of sensitive personal data or industry confidential data? ... " 

Friday, February 14, 2014

Intel as Operating System for Big Data

Intel wants to be the operating system for Big Data.  Interesting thought.  They had better understand the analytics of it.  It is not just about the hardware.  Or even data manipulation software. Have not looked at exactly what this is, but I like the thought:   " ... Intel is continuing to build out its array of software tools for the Hadoop open-source big data processing framework, with an emphasis on the security and reliability features demanded by large enterprises. A Data Platform tools suite will become available in the next quarter as a free-of-charge but self-supported Enterprise Edition, as well as a subscription Premium Edition that provides features such as proactive security fixes, regular enhancements and live support. ... " 

Thursday, January 23, 2014

Google Adds Sources

Google adds sources to search results, via its Knowledge Graph. The piece suggests that a better solution would be a reliability score based on consumer reviews. But would every source be reliably reviewed?

Wednesday, September 18, 2013

P&G and Predictive Analytics

Nice example of the use of Predictive analytics by my alma mater.   " ... By converting to a smart building system that uses predictive analytics to detect building performance abnormalities before they occur, we not only achieved energy savings in our corporate real estate portfolio, but we also improved building equipment reliability and the physical comfort of our employees," says Larry Bridge, Global Facilities and Real Estate Governance manager at P&G. "This pilot program confirmed that by using IntelliCommand, we can significantly improve the productivity of our buildings and employees. ... "

Friday, August 02, 2013

Changing the Retail Experience

In GigaOM:  " ... Thanks to the mass market shift to e-commerce and the rise of maker-oriented sites, the way we shop is changing. Grommet, a startup out of Boston, wants to help innovative products navigate the changing retail landscape. ...  The Grommet team consists of four buyers, which Grommet calls “launch managers,” who vet potential products for things such as manufacturability, quality and reliability. They then help those products find an audience. In many ways Grommet is building a higher quality and less cheesy version of the infomercial and using the web to do it at a larger scale without crazy high studio or distribution costs. ... "