Engineering – Dataconomy: Bridging the gap between technology and business

Model-based enterprise is not a utopia (you just have to try)
Thu, 18 Aug 2022

Enter the third dimension and say hello to model-based enterprise (MBE) strategies. Costly and inefficient paper-based systems are now outdated. Across the full product lifecycle, model-based enterprise strategies for company-wide digital, model-based communication are more effective than any 2D drawings.

Implicit knowledge is key across the lifecycle of a product, so technical communication within an organization is essential. It ensures that production efforts are not hampered, inspection procedures are thorough, and assemblies fit together properly. As the technology for defining products advances, so does the nature of the information. So let's take a closer look at the model-based enterprise approach to product lifecycle management.

What is model-based enterprise (MBE)?

In the manufacturing industry, a model-based enterprise (MBE) is an approach in which an annotated digital three-dimensional (3D) model of a product serves as the authoritative information source for all activities over the course of that product's lifespan.

MBE was first developed in the automotive and aerospace industries. Since then, it has been embraced by the U.S. Department of Defense (DoD) and many other large and small firms, from consumer electronics manufacturers to commercial airlines.

One of MBE's greatest benefits is that it eliminates expensive and ineffective paper-based processes. The industry standard has been 2D drawings with a single annotated view of a part or product. However, as with any paper-based document system, version control, collaboration, and the simplicity and speed of communication are significant issues.

[Image: Model-based enterprise reduces delivery times]

The global model-based enterprise market is anticipated to reach 29.35 billion USD by 2027, according to Research and Markets.

When using MBE, a single CAD model holds all the data generally included in a complete set of engineering drawings: geometries, dimensions, tolerances, materials, and manufacturing details such as weld call-outs and assembly fit-up information. The model may also include data on suppliers and the supply chain.
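To make the idea concrete, here is a minimal sketch of the kind of product and manufacturing information a single annotated model might carry. This is not any particular CAD vendor's schema; every class, field, and value below is invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Tolerance:
    nominal: float  # nominal dimension in mm
    plus: float     # upper allowance
    minus: float    # lower allowance

    def limits(self):
        """Return the (min, max) acceptable dimensions."""
        return (self.nominal - self.minus, self.nominal + self.plus)

@dataclass
class ModelBasedDefinition:
    """A single annotated model carrying the product and manufacturing
    information (PMI) that would otherwise live on 2D drawings."""
    part_number: str
    geometry_file: str  # reference to the 3D geometry, e.g. a STEP file
    material: str
    dimensions: dict = field(default_factory=dict)   # name -> Tolerance
    weld_callouts: list = field(default_factory=list)
    suppliers: list = field(default_factory=list)

bracket = ModelBasedDefinition(
    part_number="BRK-1042",
    geometry_file="bracket.step",
    material="6061-T6 aluminum",
    dimensions={"bore_diameter": Tolerance(25.0, 0.05, 0.02)},
    weld_callouts=["fillet weld, 5 mm, both sides"],
    suppliers=["Acme Metals"],
)
print(bracket.dimensions["bore_diameter"].limits())  # approximately (24.98, 25.05)
```

The point of the structure is that geometry, tolerances, material, weld call-outs, and supplier data travel together in one object, rather than being split between a model and a separate drawing.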

MBE policies and techniques were created to secure the value of comprehensive data sets (models, drawings, and derived data) as the product definition. To truly transition into an MBE, the data set must be complete, correct, under control, and maintained as the foundation for corporate cooperation and reuse. For adoption to be widespread, the data must be accessible to and trusted by users.

By concentrating on a single CAD model that centralizes all technical data, the engineering department employs MBE and MBD to accelerate design and enhance cooperation. The 3D models can frequently be run in digital manufacturing software to design and simulate manufacturing processes, so processes can be designed and tested in a low-cost virtual factory before being employed in a real manufacturing setting.

What is MBE in engineering?

Model-based enterprise (MBE) is an engineering approach that uses a 3D model-based definition (MBD), containing all the product and manufacturing information (PMI) necessary to manufacture the product, to communicate design intent throughout the manufacturing process.

Benefits of model-based enterprise

Having all of your model data in one place might seem like a utopian dream. Nevertheless, some engineers are beginning to recognize the benefits of model-based definition (MBD) and employ it as the cornerstone of their enterprises. With MBD, engineers can organize all of their model designs and comments in one location and begin to realize the full potential of their 3D assets.

These are the benefits of model-based enterprise:

  • Reduced delivery times
  • Boosted performance
  • Reduced scrap and rework
  • Improved product quality
  • Increased use of 3D resources

By utilizing MBD, market leaders are beginning to improve their information exchange, efficiency, and model utilization. As a result, they are transitioning from traditional businesses to model-based enterprises. However, many engineers are still skeptical about the true worth of MBD. So, let’s take a closer look at them.

Reduced delivery times

Using MBD, engineers can add all their annotations directly to the 3D model, sparing them the time-consuming process of reproducing annotated 2D CAD drawings. And since manufacturers already have access to this information, no extra time is spent handing it over. The time saved drastically shortens delivery times, allowing your goods to reach the market much more quickly.

Boosted performance

Once more, having one file with all of the annotations attached makes it possible to pass a single file across departments. You also won't have to waste time updating two files and confirming that they contain identical data.

This reduces the possibility that information may be missed or misunderstood throughout your conversations. Additionally, any unneeded data is removed immediately, enabling departments to do their tasks without sifting through extraneous information.

Reduced scrap and rework

Using MBD models also gives manufacturers a single source of information. The information necessary for precise manufacturing is included in the model and handed to the department transparently.

[Image: Model-based enterprise reduces scrap and rework]

Additionally, downstream consumers will find the information easier to interpret in a 3D annotated model than in a 2D drawing. This readily available and understandable information lessens the chance that parts will need to be scrapped or reworked due to misunderstandings.

Improved product quality

Since communication is so straightforward, no information required for manufacturing is lost or misunderstood. As a result, fewer mistakes and non-conformances occur during production. And because most 3D software enables engineers to reproduce the model precisely, the information carried on the model translates directly into higher quality.

Increased use of 3D resources

MBD improves information flow and communication and makes it possible to use the 3D model more efficiently. Rather than referring to a twin 2D drawing, engineers and manufacturers can make the most of the information on the 3D asset, understand it better, and generate better products as a result.

Disadvantages of model-based enterprise

Like everything else in the world, the model-based enterprise has some disadvantages. These are:

  • Adaptation: Each change entails a price and necessitates personnel training.
  • Increasing complexity: Gathering all the information together can be a complex process.

Steps to realizing a model-based enterprise

It takes time to make the transition to a model-based enterprise. Here are four essential elements for your success:

  • Visibility
  • Keep drawings
  • Execute CAD standards
  • Preparing for the change

Let’s examine them more closely.

Visibility

The simplest measures to assist the change to an MBE are visible executive backing and transparent governance. Without executive support or governance, the effort will undoubtedly be constrained before it even begins.

Executives must use a clear governance process that removes obstacles and enables change adoption in all impacted organizational functions to proactively justify the need for change, communicate the organizational benefits to users, announce progress, and manage expectations. A visible leadership style will also lessen the transition's short-term productivity decline.

Keep drawings

Many companies believe that MBE is about doing away with drawings, but this is untrue. Although it is possible to build designs without drawings, model-based engineering is fundamentally about ensuring that the data you generate to define your product is complete, correct, under control, and maintained throughout the product's entire lifecycle, from conception to disposal.

[Image: Model-based enterprise: you will need drawings anyway]

The first step is making a standard set of templates for drawings, parts, and assemblies, then ensuring that configuration parameters affecting specific CAD system behavior leave users very little flexibility. Ensure all users have access to the same CAD configuration settings, such as systems of units, views, notes, tolerances, arrow aesthetics, cross sections, etc.
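One way to picture enforcing shared configuration settings is a simple audit that compares each user's settings against the organization's template. This is a sketch, not any real CAD system's API; the setting names and values below are invented for illustration.

```python
# Hypothetical organization-wide CAD configuration template.
STANDARD_SETTINGS = {
    "units": "mm",
    "projection": "third_angle",
    "default_tolerance": "ISO 2768-m",
    "arrow_style": "filled",
}

def audit_settings(user_settings, standard=STANDARD_SETTINGS):
    """Return each setting a user has changed away from the standard,
    mapped to a (actual, expected) pair. Missing settings show as None."""
    return {
        key: (user_settings.get(key), expected)
        for key, expected in standard.items()
        if user_settings.get(key) != expected
    }

user = {
    "units": "inch",                      # deviates from the standard
    "projection": "third_angle",
    "default_tolerance": "ISO 2768-m",
    "arrow_style": "filled",
}
print(audit_settings(user))  # {'units': ('inch', 'mm')}
```

Running such a check as part of a check-in or release workflow is one way to make the standard enforceable rather than merely documented.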

Execute CAD standards

Standard CAD practices must be established and enforced throughout the organization. What ought to be covered by a CAD standard? Modeling practices, drawing techniques, and change management techniques. Keep a record of the organization's preferred procedures and standards.

Implementing CAD standards is a crucial MBE enabler, but those standards are useless without enforcement that holds users accountable. CAD without standards poses major issues wherever people attempt to reuse data or base business decisions on it.

CAD standards provide standardized CAD documentation and establish clear expectations. The bottom line is that standards promote data trust, which shortens the duration of development.

[Image: Model-based enterprise: employ CAD standards]

Additionally, it's critical to remember that following good CAD guidelines does not automatically translate into good design. A good design must adhere to industry standards and client expectations; a customer doesn't care how thorough your data or design strategy is. But upholding standards will ensure that the customer's quality expectations are met, and your coworkers will value the consistency.

Preparing for the change

The shift to a model-based enterprise is essentially a cultural adjustment. Any shift this big requires an active governance framework that works to protect the investment and increase the value of PLM and CAD tools.




Culture change happens incrementally rather than abruptly. Avoid falling for the hype surrounding "eliminating drawings." Be mindful of high expectations and set reasonable MBE maturity milestones.

Conclusion

In the manufacturing industry, a "model-based enterprise" is an approach in which an annotated digital three-dimensional model of a product serves as the authoritative information source for all activities in that product's lifetime. A crucial benefit of MBE is its ability to replace drawings with digital definitions, because the drawbacks of drawings harm today's business practices. MBE can considerably enhance the performance of manufacturing processes.

Machine learning engineering: The science of building reliable AI systems
Thu, 24 Mar 2022

Machine learning engineering aims to apply software engineering and data science methods to turn machine learning models into usable functions for products and consumers. Artificial intelligence technology is created using machine learning engineering with massive data sets. Machine learning engineering develops AI systems and algorithms to learn and ultimately make predictions.

What is a machine learning engineer?

Machine learning engineers are competent software developers who research, design, and implement autonomous programs to create predictive models. Engineers must evaluate, analyze, and organize data, execute experiments, and optimize the training procedure to construct high-performance machine learning models. An ML engineer usually works within a large data science team and collaborates with data scientists, administrators, analysts, engineers, and architects.

What are the responsibilities of a machine learning engineer?

The objective of a machine learning engineer is to design machine learning models and retrain systems as needed. Their responsibilities vary according to the organization, but there are a few universal duties for this position.

Machine learning engineers design, develop, and study machine learning systems, models, and schematics. They examine and transform data science prototypes and seek out suitable datasets. They use statistical analysis to improve models and visualize data to gain deeper insights. Engineers also analyze the use cases of machine learning algorithms and rank them according to their probability of success.
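A toy sketch of that last duty, ranking candidate models by how well they are likely to succeed, is to fit each candidate on training data and score it on held-out data. Everything here (the data, the two candidate models, the scoring choice) is invented for illustration, not a production workflow.

```python
import statistics

# Toy dataset: (feature, target) pairs, split into train and holdout sets.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.1)]
holdout = [(5, 9.8), (6, 12.2)]

def fit_mean(data):
    """Baseline candidate: always predict the mean of the training targets."""
    mean_y = statistics.mean(y for _, y in data)
    return lambda x: mean_y

def fit_linear(data):
    """Second candidate: ordinary least squares for y = a*x + b."""
    xs = [x for x, _ in data]
    mx = statistics.mean(xs)
    my = statistics.mean(y for _, y in data)
    a = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x in xs)
    return lambda x, a=a, b=my - a * mx: a * x + b

def mse(model, data):
    """Mean squared error of a fitted model on a dataset."""
    return statistics.mean((model(x) - y) ** 2 for x, y in data)

candidates = {"mean baseline": fit_mean(train), "linear": fit_linear(train)}
ranked = sorted(candidates, key=lambda name: mse(candidates[name], holdout))
print(ranked[0])  # 'linear' -- the better candidate on held-out data
```

Real pipelines use richer models and cross-validation, but the shape of the work is the same: fit candidates, score them on data they have not seen, and rank.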


Machine learning engineer salary and demand

AI projects fail when organizations lack the technical knowledge, processes, tools, and know-how to deploy ML models, a challenge that keeps interest in machine learning engineering alive across many industries. In 2019, Indeed ranked machine learning engineer as the No. 1 job in the United States. As of 2021, the average base salary for an ML engineer in the US is $149,801, according to Indeed; Glassdoor puts it lower, at $127,326. Salaries for machine learning engineers at well-known Silicon Valley companies, however, range from $200,000 to over $250,000.

Machine learning engineering is not a career limited to tech-focused businesses. Although it is a relatively new field, many organizations have already found success in applying machine learning to their problems. Virtually any organization working with large amounts of data can use machine learning expertise. Machine learning engineering lets businesses get real-time insights from data and find ways to work more efficiently, helping them gain a competitive advantage.

Over the past four years, the number of data science positions has increased by almost 75 percent and is projected to grow. Pursuing a career in machine learning is an excellent decision since it’s a high-paying profession that will be in great demand for years. Healthcare, education, marketing, retail and e-commerce, and financial services are among the industries that have already heavily invested in AI and machine learning.

How to become a machine learning engineer?

You must first acquire the necessary education and experience to become a machine learning engineer. Math, data science, computer science, computer programming, statistics, or physics are all acceptable bachelor’s degrees for machine learning engineering.

It's unlikely that you'll get your foot in the door directly as a machine learning engineer. You may need to choose a starting point such as software engineer, software programmer, data scientist, or computer scientist.

The majority of machine learning engineering jobs require more than an undergraduate degree. Seek a master's or Ph.D. in data science, computer science, or software engineering, or even a doctorate in machine learning, to get one step closer to your dream job. Building a career as a machine learning engineer entails never-ending education: as technology advances, staying on top of AI and cutting-edge technologies becomes more crucial. Understanding data structures, modeling, and software architecture is a must for this job.

What is the difference between a machine learning engineer and a data scientist?

The primary distinction between a data scientist and a machine learning engineer is that the former focuses primarily on research, whereas the latter focuses on development. The two jobs have similar responsibilities in handling large amounts of data and necessitating specific qualifications, with both requiring comparable methods.

ML specialists focus on developing and managing AI systems and predictive models, while data scientists extract important discoveries from large data sets.

Data scientists are in charge of collecting, analyzing, and interpreting massive amounts of data, which they use to construct hypotheses, draw conclusions, and analyze trends. They rely on complex analytics tools such as predictive modeling and machine learning procedures, mathematics, statistics, cluster analysis, and visualization skills. Data scientists and machine learning engineers usually collaborate closely, and both need competent data management skills.

Why an online, global workforce could be the future of construction
Thu, 08 Aug 2019

The talent shortage in architecture and engineering is huge. As noted by the World Economic Forum, the ongoing shortage of labour, including a shortage of professional talent for designers, architects, and higher levels of management, has “undermined project management and execution, adversely affecting cost, timelines and quality.”

Ultimately, time is money – and the lack of available, skilled workers and professionals is one of the biggest reasons for stagnating productivity and rising construction costs, with some reports estimating cost increases of 30 percent year-on-year.

This is a frustrating status quo for companies in the industry because it does not need to be the working reality, particularly in engineering and architecture. In fact, there are skilled engineers and architects with transferable skills all over the globe, but therein lies the problem: the workers are there, yet spread across countries. Globalization through digitized, connected systems could be the answer to construction's productivity problem, but it remains to be seen how willing and able companies are to adopt such an international, digital solution.

Working scenario in the construction industry 

Hiring practices in construction have yet to shift significantly. The process of onboarding engineers and architects and staffing projects remains more or less the same as in previous decades, with managers preferring personal networks and word-of-mouth to acquire talent. These methods work well for sourcing local talent pools but fail to unlock national or international workers who could work remotely.

This is a problem when projects need to be completed but architects or engineering designers are scarce. It is exacerbated by the fact that construction remains one of the least digitized industries on the planet, leaving a system that is not optimized for employment efficiencies. According to a digitization index by MGI, the industry lags behind its contemporaries in tech uptake, coming in second-to-last and last place in the U.S. and European lists respectively.

This lack of digital development impacts the entire industry's productivity. For example, annual global labour productivity growth in construction has averaged one percent over the past two decades. Consider that figure alongside the growth of the total world economy (2.8 percent) and the manufacturing sector (3.6 percent) over the same period. The industry has lagged, and continues to lag, behind other industry benchmarks. Clearly, core changes are needed to modernize how construction companies operate and hire.

Digitisation and remote work in construction 

Freelancing is already taking the world by storm, with more than half of all U.S. workers predicted to go freelance within the next decade, and this trend can be expected to influence engineering and architecture hires going forward. It marks a shift from how engineering professionals have been hired in the past, when companies focused mostly on local talent for local projects. Remote hiring takes full advantage of the fact that most architects and engineers work project to project and have geographically transferable skills.

Getting to this point will require a paradigm shift, but the transition should be eased by the clear benefits such hiring processes offer. For example, hiring globally instead of locally would unlock entire workforces, with employers and employees alike able to pick and choose working conditions and commit accordingly. The arrival of the gig economy in construction means the removal of talent-management overheads and savings on recruitment cost, time, and human resource processes. It affords professionals and companies the freedom to pick and choose the jobs they want.

Moreover, the construction site itself may soon take a dose of digitization and remote work tools that optimize the monitoring and reporting of progress, and construction tech startups are leading the way. Drones, monitoring equipment, and project management software that links the site's progress to the office mean that engineers will be able to monitor work from anywhere and be virtually present on site. This breaks down another barrier of localization and enables a construction engineer in another state or country to supervise construction.

The future of workforce dynamics in construction

So, what is stopping this from happening? The biggest hurdle to overcome remains the industry’s lack of digitization. Taking workers and processes online simply will not happen if construction does not embrace some kind of digital revolution. The fact that the same industry which creates megastructures and undersea tunnels remains largely analog beggars belief. Therefore, for this improved, efficient, and cost-effective future to occur, the industry must first review its processes and analyze where digital progression should take place.

This requires a willing and able mindset – but that’s easier said than done. Construction is one of the most traditional industries in the world and ingrained attitudes and ideas are tough to dislodge. Management teams and company heads will need to be flexible and teachable if they are going to change and relearn entire processes.

However, change is always easier to implement when the potential results are backed up by data – and the data clearly shows that efficiencies will result from digitization and globalization. Yes, talent shortage is prevalent in the U.S. and most western countries, but on a global level, the talent is clearly there for the picking. It just needs better access.

Data Analytics Is The Key Skill for The Modern Engineer
Mon, 24 Apr 2017

Many process manufacturing owner-operators in this next phase of a digital shift have engaged in technology pilots to explore options for reducing costs, meeting regulatory compliance, and/or increasing overall equipment effectiveness (OEE).

Despite this transformation, the adoption of advanced analytics tools still presents challenges. The extensive and complicated tooling landscape can be daunting, and many end users lack a fundamental understanding of process data analytics. Combined with a lack of awareness of the practical benefits analytics offer, this leaves many engineers stuck in day-to-day tasks, using spreadsheets and basic trend analysis tools for the bulk of their daily analysis.

In this article we discuss the need for improved analytics awareness for the modern process engineer. We also explore key considerations in creating such awareness and the capabilities that state-of-the-art self-service analytics tools offer for process performance optimization.

Connected IIoT and Data

Today, factories are producing more data than ever, forming an Industrial Internet of Things (IIoT) that enables smart factories where data can be visualized from the highest level down to the smallest detail. The key to this digital revolution is the network of connected sensors, actuators, and machines in a plant, generating trillions of samples per year.


This digital revolution offers unprecedented opportunities for improving efficiency and real-time process management – but it also presents new challenges that require innovative solutions and a new way of thinking.

Technology has evolved rapidly in response to the scale of data generated, with systems for business intelligence and data lakes now an essential part of operational excellence. However, for many engineers little has changed. They use the same systems and experience few benefits from the digital transformation taking place in their plants as they are unable to directly access the insights this new data provides.

Complexities in Analytics Options

Engineers now face a complex landscape populated with a variety of analytics tools, all promising to make sense of the newly available data: tools from traditional historians and MES (manufacturing execution system) vendors, generic big data systems such as Hadoop, and independent analytics applications. These tools address a variety of business needs but are not necessarily designed for the specific needs of engineers in the process industry.

The sheer number of business systems leads to integration issues and increased reliance on IT and big data experts. The corporate analytics vision is often based on one big data lake for all data, and proofs of concept are launched to store finance, marketing, quality, and limited amounts of production data in such lakes. However, companies frequently struggle to fit the massive time series data from processes into these exercises.

In response, many organizations create central analytics teams to address the most critical process questions affecting profitability. Data scientists create advanced algorithms and data models to combine data from multiple sources and deliver insights to optimize production processes. These analytics experts lead the way in translating time series data into actionable information.

While the insights gained from analytics teams are essential, this approach alone is insufficient to enable engineers to leverage analytics in their daily tasks. Engineers are time-poor, with little room to learn new tools; they are more concerned with meeting the immediate needs of the plant than with the promise of new and perhaps unproven technologies. They may be skeptical that investing time in the analytics system(s) will yield practical benefits, and if past analytics projects have failed to meet their expectations, there may also be frustration and disappointment. With the pressing need to ensure optimal processes, it is natural that they will revert to their current systems and tools as proven ways to get the job done.

Educating Users to Build the Perfect Beast

Just as technology has evolved to create connected plants, so engineers must be empowered to manage these factories. This is a critical shift in business culture as the entire organization must be educated and made aware of the potential of analytics as it applies to their role.

Instead of relying solely on a central analytics team that owns all the analytics expertise, subject matter experts such as process engineers should be empowered to answer their own day-to-day questions. Not only will this spread the benefits to the engineers involved in process management, it will also free the data scientists to focus on the most critical business issues.


Enabling engineers does not mean asking them to become data scientists; it means giving them access to the benefits of process data analytics. Process engineers will not (easily) become data scientists because their educational backgrounds differ (chemical engineering versus computer science). They can, however, become analytics-aware and enabled.

By bringing engineers closer to an understanding of analytics, they can solve more day-to-day questions independently and enhance their own effectiveness. In turn, they will provide their organizations with new insights based on their specific engineering expertise. This delivers value to the owner-operator at all levels of the organization and leverages (human) resources more efficiently.

Bringing an organization to this modern approach requires both a self-service analytics platform tailored to the needs of subject matter experts and the education of its users.

Self-service analytics tools are designed with end users in mind. They incorporate robust algorithms and familiar interfaces to maximize ease of use without requiring in-depth knowledge of data science. No model selection, training, or validation is required; instead, users can directly query information from their own process historians and get one-click results. Immediate access to answers encourages adoption of the analytics tool because the value is proven instantly: precious time is saved and previously hidden opportunities for improvement are unlocked.
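Under the hood, many of the "one-click" checks such tools automate amount to straightforward statistics on historian time series. The sketch below gives a rough feel for that kind of computation; the readings and the control-limit rule are invented for illustration and are not any vendor's actual algorithm.

```python
import statistics

# Mock process-historian samples: temperature readings at 1-minute intervals.
readings = [70.1, 70.4, 69.9, 70.2, 70.0, 70.3, 83.5, 70.1, 70.2, 69.8]

def flag_anomalies(samples, window=5, sigmas=3.0):
    """Flag indices whose value deviates more than `sigmas` standard
    deviations from the trailing window's mean -- a simple control-limit
    check over a sliding window."""
    flagged = []
    for i in range(window, len(samples)):
        recent = samples[i - window:i]
        mean = statistics.mean(recent)
        stdev = statistics.stdev(recent)
        if stdev and abs(samples[i] - mean) > sigmas * stdev:
            flagged.append(i)
    return flagged

print(flag_anomalies(readings))  # [6] -- the 83.5 excursion stands out
```

A self-service tool wraps this sort of logic behind a point-and-click interface, so the engineer sees the flagged excursion without writing any code.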

This self-service approach to analytics results in heightened efficiency and greater comfort with use of analytics information for the engineers, allows data scientists to focus on the questions most critical to the entire organization, and delivers enhanced profitability for owner-operators.

Like this article? Subscribe to our weekly newsletter to never miss out!

]]>
https://dataconomy.ru/2017/04/24/data-analytics-modern-engineer/feed/ 2
3 I’s of data-driven engineering https://dataconomy.ru/2016/07/14/3-is-of-data-driven-engineering/ https://dataconomy.ru/2016/07/14/3-is-of-data-driven-engineering/#respond Thu, 14 Jul 2016 08:00:02 +0000 https://dataconomy.ru/?p=16102 Where do I get started with data-driven engineering? How can the 3 I’s of data-driven engineering help me get off to a running start? How can I avoid the common pitfalls of data-driven engineering? What are the 3 I’s? The 3 I’s of data-driven engineering are insights, indicators and investments. What are insights? Insights are […]]]>

Where do I get started with data-driven engineering? How can the 3 I’s of data-driven engineering help me get off to a running start? How can I avoid the common pitfalls of data-driven engineering?

What are the 3 I’s?

The 3 I’s of data-driven engineering are insights, indicators and investments.

What are insights?

Insights are observations we derive from data generated by the software under test. They are the equivalent of observations we make about the product in the world of software testing. In software testing, we observe the software behavior, and we summarize these observations as feedback to the team. Insights are no different. We are still observing the software behavior. However, we are focusing on different ways of observing software behavior (e.g. logs, telemetry, CPU stats). In my mind, getting insights about our software is software testing.

What are indicators?

Indicators tell us that something could potentially be wrong. They indicate to us that an investigation and analysis should be done.

What are investments?

Investments are what we do based on our insights and indicators.

Why are the 3 I’s important? Why should I do this?

With the 3 I’s under your belt, you now have a framework to attack an engineering or testing problem driven by data, but why do this at all? Why not just put up a dashboard of indicators, call it a day and let the engineering team see the issues so that they fix them?

In an ideal world, all issues would be addressed as soon as they are surfaced. Hidden inside insights are 2 more I’s: investigations and interpretation. Even with the best indicators, data-driven engineers do investigations to interpret the indicators. They may dig into related data, try to correlate this with other indicators or trace through the source code. Based on the insights collected from investigating and interpreting the results, data-driven engineers push for investments or make the investments themselves.

Couldn’t all of this be automated?

Of the 3 I’s, indicators are well-suited for automation. Putting up a dashboard of indicators on a real-time basis is a job best done by a computer. However, the role of designing the indicator, deriving insights from investigations, and interpreting those insights into actionable investments is best suited for… you guessed it, a data-driven engineer.

Of all the 3 I's (5 if you include investigations and interpretation), only displaying indicators for the team can be easily automated; the other activities are uniquely human.

How can I get started?

Apply the 3 I’s!

Software testers are continuously questioning the product… Does it break when we stress it this way? Does it allow me access to something I have no permission for? What if we change the order of how I do certain steps?

Data-driven engineers are continuously questioning the product, too! What are the top server errors? What are the top client crashes? How many users do these errors impact? How many users have stopped using the product? How many new users are using it? Why are we seeing these trends?

Start with something you want to know about the product, and try to answer it with the data available. If it’s not available, then your investigation just yielded an insight! Take that insight and translate it into an actionable investment e.g. instrument the code so that we have the data to answer our question.
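The "top server errors" question above makes a good first indicator. Here is a minimal Python sketch of how one might compute it; the log format, field order, and function name are invented for this example, not taken from any real product:

```python
from collections import Counter

def top_server_errors(log_lines, n=3):
    """Count HTTP 5xx responses per endpoint and return the top offenders.

    Assumes each log line looks like 'METHOD /path STATUS' -- a made-up
    format used purely for illustration.
    """
    errors = Counter()
    for line in log_lines:
        method, path, status = line.split()
        if status.startswith("5"):  # server-side errors only
            errors[path] += 1
    return errors.most_common(n)

logs = [
    "GET /search 200",
    "GET /search 500",
    "POST /checkout 503",
    "GET /search 500",
]
print(top_server_errors(logs))  # [('/search', 2), ('/checkout', 1)]
```

If the log lines needed to answer the question don't exist yet, that gap is itself an insight pointing at an instrumentation investment.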

What are common mistakes?

You put up an indicator simply because you have it available. Insights and indicators are tied together: the indicator must mean something to the business. For example, CPU usage is a nice, well-defined indicator, but what does it mean to the user? Do we have data showing that increased CPU usage results in customers eventually giving up on the software? What's the threshold at which this really becomes a problem?

You leave out the insights and investments. Your indicators, data and findings are required for data-driven engineering. However, your insights and investments are even more important. What opinion did you form based on all the indicators and data? What insights did you derive from your investigations? What should we invest in based on the interpretations? Link your findings to the business, and all will be right in the world.

This post originally appeared on rayli.net


]]>
https://dataconomy.ru/2016/07/14/3-is-of-data-driven-engineering/feed/ 0
Alex Lo of Flatiron Health on Data Corralling https://dataconomy.ru/2015/09/29/alex-lo-of-flatiron-health-on-data-corralling/ https://dataconomy.ru/2015/09/29/alex-lo-of-flatiron-health-on-data-corralling/#respond Tue, 29 Sep 2015 15:26:59 +0000 https://dataconomy.ru/?p=14104 What’s the most misunderstood thing about “Big Data”? In my experience, corralling data is harder than people perceive and analyzing is easier than most people perceive. Not that either is easy at all, just relative to each other. With corralling, if there’s any interruption at the front end, it quickly affects things downstream. Anticipating those […]]]>

What’s the most misunderstood thing about “Big Data”?

In my experience, corralling data is harder than people perceive and analyzing is easier than most people perceive. Not that either is easy at all, just relative to each other.

With corralling, if there’s any interruption at the front end, it quickly affects things downstream. Anticipating those downstream consequences and designing ingestion to be fault tolerant is difficult to do. Additionally, it can be very difficult to distinguish dirty data from clean.

Another challenge is establishing rules to identify dirty data. You can set up hard bounds, but that doesn't help in all cases. It's preferable to take a more statistical approach with anomaly detection. Splunk offers some rudimentary tools, but even simple statistics are far better than absolute bounds.
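To make the contrast concrete, here is a minimal sketch of a statistical check in place of a hard bound, using a simple z-score; the sensor readings and the threshold are invented for illustration, and a production system would use something more robust (e.g. median-based measures):

```python
import statistics

def flag_anomalies(values, z_threshold=2.0):
    """Flag points more than z_threshold standard deviations from the
    mean -- a simple statistical alternative to fixed hard bounds.
    Note: with small samples a large outlier inflates the stdev, so the
    threshold here is deliberately loose."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # no variation, nothing to flag
    return [v for v in values if abs(v - mean) / stdev > z_threshold]

readings = [10.1, 10.3, 9.9, 10.0, 10.2, 55.0]  # one obviously dirty point
print(flag_anomalies(readings))  # [55.0]
```

Unlike a hard bound of, say, 50, the same function keeps working when the process shifts to a different operating range.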

Why do people underestimate?

The data organizations I’ve been a part of have spent far more time corralling data than analyzing it, in terms of assigned effort. The ratio is probably close to 5:1. Making software work is 10% of the work; making it work for exceptional cases (data errors, out-of-order events, network problems) is the other 90%.

Example?

When I worked at Quantitative Risk Management, we set up an API for one of our vendors to drop data every day. We built pretty simple extraction rules and counted on our provider to send us good, clean data. One day, columns were missing and it broke everything. We were caught unawares because we hadn’t planned for the possibility of a schema change. In retrospect, that was naïve, but we’d been using this vendor for a while and designed our software based on that assumption.
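A minimal sketch of the kind of schema guard that would have caught this, with a hypothetical column set (the field names are invented, not from QRM's actual feed):

```python
EXPECTED_COLUMNS = {"date", "instrument", "price", "volume"}  # hypothetical schema

def validate_schema(header_row):
    """Reject a data drop whose columns deviate from the agreed schema,
    rather than assuming the vendor will never change it."""
    received = set(header_row)
    missing = EXPECTED_COLUMNS - received
    unexpected = received - EXPECTED_COLUMNS
    if missing or unexpected:
        raise ValueError(
            f"schema drift: missing={sorted(missing)}, unexpected={sorted(unexpected)}"
        )
    return True

validate_schema(["date", "instrument", "price", "volume"])  # passes silently
```

Failing loudly at ingestion time keeps the broken drop from quietly corrupting everything downstream.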

What’s the one secret not being talked about with respect to your role and data?

People overlook how difficult it is to handle an ever-increasing data size; where do you keep it and what do you do with it? At Nomi, we kept every observation our WiFi sensors detected forever. We built our software with the assumption that we would keep every data point forever.

We understood that we’d face challenges, but also underestimated how difficult it would be to disentangle ourselves from that methodology. Through that experience, I learned that no matter how early you are as a start-up, establish up front assumptions around when and how you plan on accessing data. But, if it’s already too late and you’ve got data zooming in and performance is starting to suffer, there are mechanisms that can help. Depending on what kind of data store you’re using, there are tools to expire or archive data. MongoDB has capped collection that starts to drop data that isn’t so important once you reach a certain threshold. Other data stores like S3 have lifecycle management mechanisms that keep data hot for a given period of time, like 60 days, and then automatically purge thereafter.

What’s the most frustrating thing about your work with data?

The most frustrating thing is developing processes around data that end up being fragile or require maintenance. When I was on the Market Data team for QRM, we would get a data dump at 8am, run our process on it, and release it to clients at 9am. If there was a QA issue, we had a very short period of time to detect and resolve it.  When things go well, clients don’t think much of it; they’re supposed to go well.  If things go wrong, they’ll remember it forever. Even though our vendor was quite reliable and there were mistakes maybe 1% of the time, it didn’t matter; clients have a long memory.

What I learned from this was that the more fragile your processes are, the more leeway you want to bake in. Automate as much as you can. We automated detection of our input’s format so we could tell whether a given input was ingestible, and we established ranges around input values. In an ideal world, we would have implemented a more sophisticated approach using statistical ranges rather than absolute values as thresholds; there’s so much variance that hard bounds are not always effective.
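A hedged sketch of that range-checking step, with invented field names and bounds standing in for the real market-data checks:

```python
def is_ingestible(row, ranges):
    """Check a parsed data row against per-field hard bounds.

    The field names and bounds are hypothetical, chosen only to
    illustrate the pattern of gating ingestion on plausibility checks.
    """
    for field, (low, high) in ranges.items():
        if field not in row:
            return False          # format problem: field missing entirely
        if not (low <= row[field] <= high):
            return False          # value outside its plausible range
    return True

RANGES = {"yield": (0.0, 25.0), "maturity_years": (0, 100)}
print(is_ingestible({"yield": 4.2, "maturity_years": 10}, RANGES))   # True
print(is_ingestible({"yield": 420.0, "maturity_years": 10}, RANGES)) # False
```

The statistical-range variant mentioned above would replace the fixed `(low, high)` tuples with bounds derived from recent history.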

What’s the career accomplishment you’re most proud of?

While I was at Nomi, we scaled from 300M data points a week to >1B. Trying to figure out how to push our systems to the limit was intellectually rewarding. We had to experiment with ways to get higher throughput given our constraints.

Who has been the most important mentor in your career?

I’ve been fortunate to have many people mentor me from a number of perspectives. I see being in technology as always 20% learning / 80% practice. One person in particular who was instrumental in my shift from the application side to infrastructure was Michael Hamrah (now at Uber). He gave a fascinating presentation on how he facilitated experimentation and scalability by designing a decentralized, easy-to-implement infrastructure. His premise was that by decoupling and designing with usability in mind, you could remove the barriers to experimentation.

Even though Michael didn’t put it this way, I think our friends on the Oscar Health infrastructure team (Mackenzie Kosut and Brent Langston) put it best: “Make everything simple and reproducible.” If you force yourself to make every operation repeatable, you automate more. As a result, you’re able to test your system against a new data set without any penalty, in a way that encourages experimentation.

Adopting these principles can be easier said than done. In many cases, you are not designing and building something new from the ground up; you’re making incremental improvements to existing infrastructure. It’s difficult to balance the need to support something that’s currently working while trying to conceptualize and design its next generation. Balancing those two competing concerns isn’t easy. I wouldn’t say I’ve been great at it, but what I have observed is that the big rewrite from the ground up is rarely successful. It’s better to find some way to incrementally reduce constraints or increase capabilities.

What company is doing “Big Data” right?

I am very excited by what’s going on with Kafka, led by http://www.confluent.io/. Turning data producers into first-class citizens is the next generation of data architectures. I find it very interesting to apply a push model to data architecture, versus the more traditional ‘pull’ approach. A data system’s end goal is usually to produce a data product: an analysis, a data dump, or a table that can be queried. Generally that’s done through a batch job that pulls the data required to create the product. Kafka and other new architectures turn that model on its head, allowing producers of data to trigger the action (e.g. the origination of a data point or an API call causes the creation of the final output). That sort of push-based architecture is more reproducible, and it lets you feed a single root data stream to two versions of a consumer and compare them head to head, which you can’t do with other systems.
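The head-to-head comparison idea can be sketched with a toy push-based stream; this is a conceptual illustration in plain Python, not the Kafka client API:

```python
class Stream:
    """A toy push-based stream: producers publish, and every subscribed
    consumer version receives the same events, so their outputs can be
    compared head to head on identical input."""

    def __init__(self):
        self.consumers = []

    def subscribe(self, fn):
        self.consumers.append(fn)

    def publish(self, event):
        for fn in self.consumers:   # push the event to all consumer versions
            fn(event)

# Two versions of the same aggregation consuming one root stream.
totals_v1, totals_v2 = [], []
stream = Stream()
stream.subscribe(lambda e: totals_v1.append(e))         # v1: raw values
stream.subscribe(lambda e: totals_v2.append(round(e)))  # v2: rounded values

for event in [1.4, 2.6, 3.5]:
    stream.publish(event)

print(totals_v1, totals_v2)  # compare the two versions on identical input
```

Because both versions consumed exactly the same root stream, any divergence between `totals_v1` and `totals_v2` is attributable to the consumer logic alone, which is the reproducibility property the push model buys you.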

What’s the one piece of advice you would share with a younger version of yourself?

The person who is most responsible for your personal development is yourself. Other people are consultants. If you find yourself working in a place that does not emphasize your learning and growth, choose the most challenging projects whenever possible. Try to stay up to date outside of work: attend Meetups, talk to peers, and contribute to open source projects. Part of working is investing in yourself, and places that don’t recognize that are limiting your growth potential.

[bctt tweet=”The person who is most responsible for your personal development is yourself.”]

What would you like to ask readers of this interview?

How are other people keeping up? I always find it quite challenging so I’m curious to hear what people are reading or attending.

This interview is part of Schuyler’s Data Disruptors series over on the StrongDM blog.

]]>
https://dataconomy.ru/2015/09/29/alex-lo-of-flatiron-health-on-data-corralling/feed/ 0
19-21 December, 2014- 17th IEEE International Conference on Computational Science and Engineering, Chengdu, China https://dataconomy.ru/2014/12/11/19-21-december-2014-17th-ieee-international-conference-on-computational-science-and-engineering-chengdu-china/ https://dataconomy.ru/2014/12/11/19-21-december-2014-17th-ieee-international-conference-on-computational-science-and-engineering-chengdu-china/#respond Thu, 11 Dec 2014 15:11:35 +0000 https://dataconomy.ru/?p=10960 The Computational Science and Engineering area has earned prominence through advances in electronic and integrated technologies beginning in the 1940s. Current times are very exciting and the years to come will witness a proliferation in the use of various advanced computing systems. It is increasingly becoming an emerging and promising discipline in shaping future research […]]]>

The field of Computational Science and Engineering has earned prominence through advances in electronic and integrated technologies beginning in the 1940s. Current times are very exciting, and the years to come will witness a proliferation in the use of advanced computing systems. It is an emerging and promising discipline shaping future research and development activities in academia and industry alike, ranging across engineering, science, finance, economics, the arts, and humanitarian fields, especially when the solution of large and complex problems must cope with tight timing schedules.

CSE2014 will be held in Chengdu, China on December 19-21, 2014. It aims to bring together computer scientists, industrial engineers, and researchers to discuss and exchange experimental and theoretical results, novel designs, work-in-progress, experience, case studies, and trend-setting ideas in advanced computing for science and engineering applications and interdisciplinary areas.

Full speaker line-up and registration can be found here.

]]>
https://dataconomy.ru/2014/12/11/19-21-december-2014-17th-ieee-international-conference-on-computational-science-and-engineering-chengdu-china/feed/ 0