Science – Dataconomy
https://dataconomy.ru

AI scans detect sex-based brain disparity
https://dataconomy.ru/2024/05/16/ai-sex-based-brain-difference/
Thu, 16 May 2024

The human brain, that intricate mass of grey matter nestled within our skulls, has captivated scientists for centuries. Its structure, function, and the way it shapes our thoughts, emotions, and behaviors have been endlessly debated and explored. One longstanding question is whether there are fundamental differences in brain organization between men and women.

While some variations in size and weight have been observed, a comprehensive understanding of sex-based disparities in brain structure has remained elusive.

However, a recent study utilizing artificial intelligence (AI) has shed new light on this mystery, unveiling potential clues about the intricate architecture of the human brain.

Finding microscopic nuances in the brain with AI

Traditionally, studying brain structure has relied on techniques like magnetic resonance imaging (MRI). MRIs provide detailed images of the brain, allowing scientists to examine its overall shape, volume, and the distribution of grey and white matter. However, these methods often lack the resolution to detect subtle variations at the cellular level. This is where AI steps in.

The recent study, conducted by researchers at NYU Langone Health, employed a specific type of AI called machine learning. Machine learning algorithms can analyze vast amounts of data, identifying patterns that might escape the human eye. In this instance, the researchers used machine learning to analyze MRI scans from hundreds of participants, both men and women.

[Image: Traditional brain structure studies lacked the resolution to detect subtle variations at the cellular level]

The AI program meticulously sifted through the MRI data, focusing on white matter, a critical component of the brain responsible for communication between different regions. By meticulously analyzing the intricate patterns within the white matter, the AI program was able to distinguish between male and female brains with surprising accuracy. This suggests that there might be fundamental differences in the way white matter is organized at the microscopic level, potentially influencing how information flows within the brain.

Multiple models confirm the pattern

The researchers employed a particularly interesting approach to validate their findings. Instead of relying on a single AI model, they utilized three different machine learning algorithms, each with its own strengths. One model focused on meticulously examining small sections of white matter, while another analyzed the relationships between white matter distribution across broader brain regions. Remarkably, all three models arrived at the same conclusion – they could accurately differentiate between male and female brains based on subtle variations in white matter structure. This consistency across different AI models strengthens the validity of the discovery, suggesting that the observed sex-based differences are not simply random fluctuations in the data.
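
The article does not include the study's code, but the cross-model validation idea is easy to illustrate. Below is a minimal Python sketch, assuming a feature matrix of white-matter measurements per subject and binary sex labels; the synthetic data and the three scikit-learn models stand in for the study's actual data and algorithms:

```python
# Cross-model validation sketch: X and y are synthetic stand-ins for the
# study's MRI-derived white-matter features and sex labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in: 500 "subjects", 100 white-matter features each
X, y = make_classification(n_samples=500, n_features=100,
                           n_informative=20, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.2f} (+/- {scores.std():.2f})")
```

If all three models score well above chance, the separability is unlikely to be an artifact of any one model's assumptions, which is the logic behind the researchers' consistency check.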

The implications of this study are far-reaching. A deeper understanding of how sex influences brain structure could pave the way for more precise diagnoses and treatments for various neurological conditions.

For instance, some neurological disorders, like autism spectrum disorder and migraines, exhibit differences in prevalence and symptom severity between men and women. By elucidating the underlying sex-based variations in brain structure, researchers might be able to develop more targeted therapies for these conditions.

Furthermore, this study highlights the immense potential of AI in healthcare research. Machine learning’s ability to analyze vast datasets and detect subtle patterns can revolutionize our understanding of the brain, potentially leading to groundbreaking discoveries in the years to come.


Featured image credit: Freepik

Google’s AlphaFold 3 AI system takes on the mystery of molecules
https://dataconomy.ru/2024/05/09/google-alphafold-3-ai-drug-discovery/
Thu, 09 May 2024

The fight against diseases has been a constant pursuit in the medical field. From the dawn of medicine, researchers have tirelessly strived to understand the intricate workings of the human body and the microscopic foes that threaten our health. One crucial area of focus has been on medications, those life-saving molecules designed to interact with our biology and combat illnesses. However, efficiently designing these drugs has long been a challenging process, often requiring years of research and testing.

This is where a new tool emerges, armed with the power of artificial intelligence (AI). Google DeepMind, the company’s AI research lab, has introduced AlphaFold 3, a revolutionary molecular prediction model.

So, what exactly is AlphaFold 3 and how does it propose to change the landscape of drug discovery?

AlphaFold 3 observes the dance of molecules in living cells

Imagine billions of tiny machines working together inside every cell of your body. These machines, built from proteins, DNA, and other molecules, orchestrate the complex processes of life. But to truly understand how life works, we need to see how these molecules interact with each other in countless combinations.

In a recent paper by Google, researchers describe how AlphaFold 3 can predict the structure and interactions of all these life molecules with unmatched accuracy. The model significantly improves upon previous methods, particularly in predicting how proteins interact with other molecule types.

AlphaFold 3 builds on the success of its predecessor, AlphaFold 2, which made a breakthrough in protein structure prediction in 2020. While AlphaFold 2 focused on proteins, AlphaFold 3 takes a broader view. It can model a wide range of biomolecules, including DNA, RNA, and small molecules like drugs. This allows scientists to see how these different molecules fit together and interact within a cell.

The model’s capabilities stem from its next-generation architecture and training on a massive dataset encompassing all life’s molecules. At its core lies an improved version of the Evoformer module, the deep learning engine that powered AlphaFold 2. AlphaFold 3 then uses a diffusion network to assemble its predictions, similar to those used in AI image generation. This process starts with a scattered cloud of atoms and gradually refines it into a precise molecular structure.

The model’s ability to predict molecular interactions surpasses existing systems. By analyzing entire molecular complexes as a whole, AlphaFold 3 offers a unique way to unify scientific insights into cellular processes.

How does AlphaFold 3 work?

AlphaFold 3’s ability to predict the structure and interactions of biomolecules lies in its sophisticated architecture and training process. Here’s a breakdown of the technical details:

1. Deep learning architecture: The foundation

AlphaFold 3 relies on a sophisticated deep learning architecture, likely an enhanced version of the Evoformer module used in its predecessor, AlphaFold 2. Deep learning architectures are powerful tools capable of identifying complex patterns within data. In AlphaFold 3’s case, the patterns of interest lie within the amino acid sequences of biomolecules.

2. Processing the blueprint: Input and attention mechanisms

The model likely receives the amino acid sequence of a biomolecule as input. It then employs attention mechanisms to analyze the sequence and identify critical relationships between different amino acids. Attention mechanisms allow the model to focus on specific parts of the sequence that are most relevant for predicting the final structure.
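
To make the mechanism concrete, here is a toy scaled dot-product attention in Python. It shows only the core operation the text describes; AlphaFold 3's actual attention modules (pair representations, triangle updates, and so on) are far more elaborate:

```python
# Toy scaled dot-product attention over a sequence of residue embeddings.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each position attends to every other; weights sum to 1 per row."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(0)
seq = rng.normal(size=(8, 16))    # 8 residues, 16-dimensional embeddings
out, attn = scaled_dot_product_attention(seq, seq, seq)
print(attn.shape)  # (8, 8): how strongly each residue attends to each other
```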

3. Building the molecule: Diffusion networks take over

After processing the input sequence, AlphaFold 3 utilizes a diffusion network to assemble its predictions. Diffusion networks are a type of generative model that progressively refine an initial guess towards a more accurate output. In this context, the initial guess might be a scattered cloud of atoms representing the potential locations of each atom in the biomolecule.


Through a series of steps, the diffusion network iteratively adjusts these positions, guided by the information extracted from the sequence and inherent physical and chemical constraints.
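
Here is a toy sketch of that refinement loop, with one important caveat: a real diffusion model learns its denoising step from data, whereas this illustration simply moves points toward a known target.

```python
# Diffusion-style refinement illustration: a scattered cloud of "atom"
# positions is nudged, step by step, toward a final configuration. The
# "denoiser" here is just the known target; in AlphaFold 3 the update
# direction is predicted by a trained network.
import numpy as np

rng = np.random.default_rng(42)
target = rng.uniform(-1, 1, size=(10, 3))   # idealized final structure
x = rng.normal(scale=3.0, size=(10, 3))     # scattered initial cloud

for step in range(50):
    x += 0.1 * (target - x)                 # one small refinement step

print(np.abs(x - target).max())             # near zero after refinement
```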

4. Obeying the laws of nature: Physical and chemical constraints

AlphaFold 3 likely incorporates knowledge of physical and chemical constraints during structure prediction. These constraints ensure the predicted structures are realistic and adhere to scientific principles. Examples of such constraints include bond lengths, bond angles, and steric clashes (atoms being too close together).
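
As a hedged illustration, the checks below test two such constraints: plausible bond lengths and the absence of steric clashes. The thresholds are illustrative, not AlphaFold 3's actual parameters:

```python
# Geometric sanity checks of the kind a structure predictor must respect.
import numpy as np

def bond_length_ok(a, b, expected=1.5, tol=0.2):
    """Carbon-carbon single bonds are ~1.5 angstroms; flag big deviations."""
    return abs(np.linalg.norm(a - b) - expected) <= tol

def has_steric_clash(coords, min_dist=1.0):
    """Any two atoms closer than ~1 angstrom would be unphysical."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)          # ignore self-distances
    return bool((dists < min_dist).any())

coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0]])
print(bond_length_ok(coords[0], coords[1]))  # True
print(has_steric_clash(coords))              # False
```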

5. Learning from examples: Training on vast datasets

AlphaFold 3’s impressive accuracy is attributed to its training on a massive dataset of biomolecules. This data likely includes known protein structures determined experimentally using techniques like X-ray crystallography. By analyzing these known structures alongside their corresponding amino acid sequences, AlphaFold 3 learns the intricate relationship between sequence and structure, enabling it to make accurate predictions for unseen biomolecules.

Applications in drug discovery are vast

One of the most exciting applications of AlphaFold 3 lies in drug design. The model can predict how drugs interact with proteins, offering valuable insights into how they might influence human health and disease.

For example, AlphaFold 3 can predict how antibodies bind to specific proteins, a crucial aspect of the immune response and the development of new antibody-based therapies.

Isomorphic Labs, a company specializing in AI-powered drug discovery, is already collaborating with pharmaceutical companies to utilize AlphaFold 3 for real-world drug design challenges. The goal is to develop new life-saving treatments by using AlphaFold 3 to understand new disease targets and refine existing drug development strategies.

[Image: Application studies for AlphaFold 3 in drug discovery have begun]

Making the power accessible

To make AlphaFold 3’s capabilities available to a wider scientific community, Google DeepMind launched AlphaFold Server, a free and user-friendly research tool. This platform allows scientists worldwide to harness the power of AlphaFold 3 for non-commercial research. With just a few clicks, biologists can generate structural models of proteins, DNA, RNA, and other molecules.

AlphaFold Server empowers researchers to formulate new hypotheses and accelerate their work. The platform provides easy access to predictions regardless of a researcher’s computational resources or machine learning expertise. This reduces the reliance on expensive and time-consuming experimental methods of protein structure determination.

Sharing responsibly and looking ahead

With each iteration of AlphaFold, Google DeepMind prioritizes responsible development and use of the technology. They collaborate extensively with researchers and safety experts to assess potential risks and ensure the benefits reach the broader scientific community.

AlphaFold Server reflects this commitment by providing free access to a vast database of protein structures and educational resources. Additionally, Google DeepMind is working with partners to equip scientists, particularly in developing regions, with the tools and knowledge to leverage AlphaFold 3 for impactful research.

AlphaFold 3 offers a high-definition view of the biological world, allowing scientists to observe cellular systems in their intricate complexity. This newfound understanding of how molecules interact promises to revolutionize our understanding of biology, pave the way for faster drug discovery, and ultimately lead to advancements in human health and well-being.


Featured image credit: Google

AI is infiltrating scientific literature day by day
https://dataconomy.ru/2024/04/26/ai-usage-in-scientific-literature/
Fri, 26 Apr 2024

Academic and scientific research thrives on originality. Every experiment, analysis, and conclusion builds upon a foundation of previous work.

This process ensures scientific knowledge advances steadily, with new discoveries shedding light on unanswered questions.

Researchers have long relied on precise language to convey complex ideas. Scientific writing prioritizes clarity and objectivity, with technical terms taking center stage. But a recent trend in academic writing has raised eyebrows – a surge in the use of specific, often ‘flowery’, adjectives.

A study by Andrew Gray, as reported by EL PAÍS, identified a peculiar shift in 2023. Gray analyzed a vast database of scientific studies published that year and discovered a significant increase in the use of certain adjectives.

Words like “meticulous,” “intricate,” and “commendable” saw their usage skyrocket by over 100% compared to previous years.
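
The kind of corpus analysis behind such findings is straightforward to sketch: count how often flagged adjectives occur per million words, year over year. The word list and the tiny corpora below are placeholders, not Gray's actual data:

```python
# Sketch of a year-over-year adjective frequency comparison.
from collections import Counter

FLAGGED = {"meticulous", "intricate", "commendable"}

def rate_per_million(texts):
    counts, total = Counter(), 0
    for text in texts:
        words = text.lower().split()
        total += len(words)
        counts.update(w for w in words if w in FLAGGED)
    return {w: 1e6 * c / total for w, c in counts.items()}

corpus_2022 = ["the analysis was thorough and careful"]          # placeholder
corpus_2023 = ["a meticulous and commendable study of intricate data"]
print(rate_per_million(corpus_2022))  # {}
print(rate_per_million(corpus_2023))  # all three flagged words appear
```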

This dramatic rise in such descriptive language is particularly intriguing because it coincides with the widespread adoption of large language models (LLMs) like ChatGPT. These AI tools are known for their ability to generate human-quality text, often employing a rich vocabulary and even a touch of flair. While LLMs can be valuable research assistants, their use in scientific writing raises concerns about transparency, originality, and potential biases.

[Image: Scientific progress relies on originality, challenging existing paradigms, and proposing novel explanations]

To illustrate the magnitude of the issue, consider a peer-reviewed research article. The introduction of a paper titled “The three-dimensional porous mesh structure of Cu-based metal-organic-framework – aramid cellulose separator enhances the electrochemical performance of lithium metal anode batteries”, published in March 2024, begins as follows:

“Certainly, here is a possible introduction for your topic:Lithium-metal batteries are promising candidates for high-energy-density rechargeable batteries due to their low electrode potentials and high theoretical capacities…”

– Zhang et al.

Yes, artificial intelligence makes our lives easier, but this does not mean we should trust it blindly. Researchers should approach AI in the scientific literature the same way they would approach AI at work: taking inspiration from it rather than having it do everything.

Although Andrew Gray said in his statement, “I think extreme cases of someone writing an entire study with ChatGPT are rare,” a little research shows that such cases are not as rare as one might hope.

The originality imperative in scientific research

Originality lies at the heart of scientific progress. Every new finding builds upon the existing body of knowledge and takes us a step closer to understanding life.

The importance of originality extends beyond simply avoiding plagiarism. Scientific progress hinges on the ability to challenge existing paradigms and propose novel explanations. If AI tools were to write entire research papers, there’s a risk of perpetuating existing biases or overlooking crucial questions. Science thrives on critical thinking and the ability to ask “what if“.

These are qualities that, for now at least, remain firmly in the human domain; current evidence suggests generative AI falls well short of genuine creativity.

[Image: Transparency and robust peer review are essential in scientific research, especially in disclosing the use of AI tools for writing assistance]

The need for transparency

The potential infiltration of AI into scientific writing underscores the need for transparency and robust peer review. Scientists have an ethical obligation to disclose any tools or methods used in their research, including the use of AI for writing assistance. This allows reviewers and readers to critically evaluate the work and assess its originality.

Furthermore, the scientific community should establish clear guidelines on the appropriate use of AI in research writing. While AI can be a valuable tool for generating drafts or summarizing complex data, it should not, and probably never will, replace human expertise and critical thinking. Ultimately, the integrity of scientific research depends on researchers upholding the highest standards of transparency and originality.

As AI technology continues to develop, it’s crucial to have open discussions about its appropriate role in scientific endeavors. By fostering transparency and prioritizing originality, the scientific community can ensure that AI remains a tool for progress, not a shortcut that undermines the very foundation of scientific discovery.


Featured image credit: Freepik

AI is revolutionizing every field and science is no exception
https://dataconomy.ru/2022/11/09/artificial-intelligence-in-science-examples/
Wed, 09 Nov 2022

Today, AI is used in almost every industry, and science is no exception. The amount of data generated by many of today’s physics and astronomy studies is so great that no human or group of humans could keep up. Some experiments record gigabytes of data daily, and the torrent is only getting bigger.

Faced with this flood, many scientists are looking to artificial intelligence for assistance. Artificial neural networks, computer-simulated neurons that loosely replicate the function of brains, can plow through mounds of data with little to no human input, highlighting anomalies and spotting patterns that people would never have noticed.

Artificial intelligence in science

Researchers are unleashing artificial intelligence (AI), frequently in the form of artificial neural networks, on the data torrents in a revolution that spans much of science. Such “deep learning” systems don’t require human experts to be trained, in contrast to prior attempts at AI. Instead, they acquire knowledge independently, frequently from massive training data sets, until they can recognize patterns and identify abnormalities in data sets that are much bigger and messier than what humans can handle.

In addition to revolutionizing science, AI is now speaking to you on your smartphone, driving itself on the road, and unnerving futurists who fear it may result in widespread unemployment. Prospects for scientists are generally good because AI promises to speed up the research process.

[Image: The amount of data generated by many of today’s physics and astronomy studies is so great that no human could possibly keep up]

As the significance of AI in science grows, it is likely to become more important than ever to understand the mind inside the machine. Some innovators are already using AI to plan, execute, and interpret experiments, which opens the door to automated science. The diligent trainee might quickly advance to full-fledged colleague status.

Artificial intelligence in science: Biology

Today’s most interesting medical discoveries are being made at the intersection of biology and computer science, using techniques provided by artificial intelligence in science.

Despite initiatives to do so, there has been little progress in integrating studies from many branches of biology.


Researchers postulate that reintegrating biology will be made possible by upcoming generations of artificial intelligence (AI) technology tailored for the biological sciences. AI technology will let us collect, link, and analyze data at previously unheard-of scales, and build thorough predictive models that cut across numerous fields of study.

They will enable both targeted discoveries (testing particular hypotheses) and untargeted ones. Artificial intelligence in biology is the interdisciplinary technology that will improve our capacity to do biological research at all scales. Just as statistics revolutionized biology in the 20th century, AI is expected to do the same in the 21st.

[Image: As the significance of AI in science grows, it is likely to become more important than ever to understand the mind inside the machine]

The challenges, however, are numerous and include data collection and assembly, the creation of new science in the form of theories that link the various fields, and the development of new predictive and understandable AI models that are better suited to biology than current machine learning and AI techniques. Strong partnerships between biological and computational scientists will be necessary to advance development initiatives.

Artificial intelligence in science: Physics

In the early days of physics, mathematical models were carefully written out and solved by hand. Today, thanks to artificial intelligence, researchers can model and calculate complicated physics problems with far more speed, accuracy, and originality than ever before. The following are some notable AI-related physics research projects.

Researchers have long been fascinated by questions about the nature of the universe; we now know more about other planets than we do about the deep ocean on our own. That fascination has generated investment and interest in space research, and there is still a great deal of fundamental knowledge to acquire.


In a recent article, scientists explain how they used a neural network model to forecast the genesis of the universe’s structure. One of the “holy grails of modern astrophysics” is, in the words of the paper’s authors, “to fully understand the structure formation of the Universe.”

[Image: Today, researchers can model and calculate complicated physics issues with a great deal more speed and accuracy]

The researchers’ Deep Density Displacement Model (D3M), which uses deep learning to produce intricate 3D simulations in cosmology, was inspired by their desire to conquer this big unknown.

AI-driven frameworks are accelerating a wide range of important physics research topics. These innovations show the long-lasting influence AI is only now beginning to have on scientific discovery, from protein structures to climate modeling and gravitational wave detection to understanding the universe.

The use of AI to develop new models for tackling challenging physics problems has the potential to significantly accelerate scientific progress in the most fundamental areas of knowledge that explain and govern the world and cosmos in which we exist.

Artificial intelligence in science: Chemistry

Artificial intelligence has recently become one of the most discussed topics in chemistry, and the two fields go hand in hand. The healthcare sector uses chemistry and artificial intelligence mostly for the development of new drugs.

[Image: Drug development is not the only area in which AI is utilized in science]

The production and formulation of drugs have changed dramatically as a result of the fusion of technology and medicine. This shift also reflects increased research and development in the pharmaceutical industry, driven by the advanced tools and equipment scientists now use. It is a key example of the improvements artificial intelligence brings to science.


However, drug development is not the only area where artificial intelligence serves science. The building blocks of chemical bonds and molecules, which form the basis of chemistry, are just the beginning: AI can assist with everything from molecule synthesis to the identification of molecular properties across chemistry and related fields.

AI in science and research

Artificial intelligence facilitates many scientific processes, including research methods.

How is AI used in scientific research?

Below are some of the most interesting examples of AI in science and research:

Protein structures can be predicted using genetic data

The function of a protein in the body can be understood from its shape. By predicting protein structures, scientists can identify proteins that are involved in diseases, which helps with diagnostics and the creation of new medicines.

Experimental protein structure determination is labor-intensive and technically challenging, and it has produced on the order of 100,000 known structures to date. Meanwhile, recent advances in genetics have produced huge datasets of DNA sequences, yet identifying the shape of a protein from its corresponding genetic sequence, the protein-folding problem, remains a difficult task.

[Image: The function of a protein in the body can be understood by taking into account its form]

Researchers are creating machine learning methods that can predict the three-dimensional structure of proteins from DNA sequences to aid our understanding of this process, a significant development for AI in science. For instance, DeepMind’s AlphaFold project has developed a deep neural network that predicts the distances between pairs of amino acids and the angles between their bonds, producing a highly accurate overall prediction of a protein’s structure.

Recognizing how climate change affects cities and regions

Environmental research combines the need to analyze vast volumes of collected data with the need to simulate complicated systems. To guide decision-making at the national or local level, predictions from global climate models need to be translated into their effects on cities or regions: forecasting the number of summer days where temperatures reach 30°C within a city in 20 years, for instance.

These small locations could have access to in-depth observational data about their environment, such as that provided by weather stations, but given the baseline changes brought on by climate change, it is challenging to make reliable estimates from this data alone.

[Image: Predictions from global climate models need to be understood in terms of their effects on cities or regions in order to guide decision-making at the national or local level]

Machine learning can help fill the gap between these two forms of knowledge by merging the low-resolution outputs of climate models with detailed but local observational data. The resulting hybrid analysis would improve on the climate models produced by conventional techniques and provide a fuller picture of the local implications of climate change.
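
A minimal sketch of this idea, often called statistical downscaling: fit a simple model mapping coarse climate-model output to local station observations, then apply it to projected values. All numbers below are synthetic placeholders:

```python
# Learn a coarse-to-local mapping from synthetic historical data, then
# apply it to projected grid-cell temperatures.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
coarse_temp = rng.normal(20, 5, size=(1000, 1))   # model grid-cell temps
station_temp = 1.1 * coarse_temp[:, 0] + 2.0 + rng.normal(0, 1, 1000)

downscaler = LinearRegression().fit(coarse_temp, station_temp)

future_coarse = np.array([[28.0], [31.0]])        # projected grid temps
local = downscaler.predict(future_coarse)
print(local, (local > 30).sum())                  # e.g. days above 30°C
```

In practice the local response depends on far more than one variable, which is why hybrid analyses fold in full station records rather than a single regression.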

Analyzing astronomical data

Large volumes of data are produced during astronomy research, making it difficult to separate the interesting features or signals from the background noise and to classify them appropriately. For instance, to find Earth-sized planets circling other stars, the Kepler mission gathered observations of the Orion Spur and beyond that may point to the existence of stars or planets.

[Image: Large volumes of data are produced during astronomy research, making it difficult to separate the interesting features or signals from the background noise and classify them appropriately]

Not all of this information is helpful: observations may be skewed by changes in stellar activity, onboard thruster firings, or other systematic tendencies. These so-called instrumental artifacts must be removed before the data can be analyzed, and researchers have created machine learning systems that recognize and eliminate them, cleaning the data for future studies.
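
Here is a hedged sketch of one common cleaning step: removing a slow systematic trend and sigma-clipping outliers from a light curve. Real pipelines, including the machine learning systems described above, are far more sophisticated:

```python
# Detrend a synthetic light curve and mask spike-like artifacts.
import numpy as np

rng = np.random.default_rng(7)
t = np.linspace(0, 10, 500)
flux = 1.0 + 0.01 * t + rng.normal(0, 0.002, t.size)  # drift + noise
flux[100] += 0.05                                     # thruster-like spike

trend = np.polyval(np.polyfit(t, flux, deg=1), t)     # fit the slow drift
residual = flux - trend

sigma = residual.std()
clean = residual[np.abs(residual) < 3 * sigma]        # 3-sigma clipping
print(f"kept {clean.size} of {residual.size} points")
```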

AI in science examples

Identifying star and supernova features, classifying galaxies, and detecting new pulsars from existing data sets are just a few examples of how machine learning has been utilized to discover new celestial events.

Machine learning has emerged as a crucial tool for academics working in a variety of fields to analyze massive datasets, find patterns that were previously unnoticed, or derive surprising insights. Its prospective applications span far more disciplines than can be covered here, but the following are some study domains with emerging uses of AI.

Interpreting social history with archival data

The British Library’s National Newspaper archive contains millions of pages of out-of-copyright newspaper collections, and researchers are working with curators to create new software to analyze the data extracted from these collections. They will also make use of other historical collections that have been digitally preserved, particularly government-collected information from the Census and the registration of births, marriages, and deaths.

[Image: Machine learning has emerged as a crucial tool for academics working in a variety of fields to analyze massive datasets]

As a result, computational linguists and historians will be able to follow societal and cultural development during the Industrial Revolution, as well as changes brought on by the advancement of technology in all spheres of society. Importantly, these new study techniques will put the lives of regular people in the spotlight, thanks to artificial intelligence in science.

Using satellite images to aid in conservation

Because they live only in the sea-ice zone, which is particularly difficult to survey, several species of Antarctic seals are very challenging to monitor. The deployment of very high-resolution satellites has significantly lowered the expense and effort required to identify these seals in imagery.

[Image: These new study techniques will put the lives of regular people in the spotlight]

However, it takes a long time to manually count the seals over the wide area of ice they live on, and different analysts report different counts. Machine learning techniques could automate this task, producing quick, consistent counts with known error rates.

Understanding complex organic chemistry

A pilot project with The Alan Turing Institute and The John Innes Centre aims to explore the potential of machine learning in modeling and forecasting the triterpene biosynthesis pathway in plants. Triterpenes are intricate molecules that make up a sizable and significant class of plant-based natural compounds with numerous commercial uses in the fields of health, agriculture, and industry.

[Image: AI and data analytics are poised to revolutionize a wide range of industries]

Over 20,000 structurally distinct triterpenes can be produced by customizing enzymes acting on a single common substrate, the starting point for the synthesis of all triterpenes. Methods for forecasting the outcomes of organic chemical reactions have recently shown promise, and making accurate predictions from sequence requires both a thorough grasp of the biosynthetic processes that produce triterpenes and cutting-edge machine learning techniques. This is the kind of problem that artificial intelligence in science makes tractable.

Conclusion

In conclusion, artificial intelligence and data analytics are poised to revolutionize a wide range of industries. Significant deployments have already changed decision-making, business models, risk mitigation, and system performance in the financial, national security, healthcare, criminal justice, transportation, and smart city sectors, and science is no exception. These changes are producing significant economic and social benefits.

10 Rules for Creating Reproducible Results in Data Science
https://dataconomy.ru/2017/07/03/10-rules-results-data-science/
Mon, 03 Jul 2017

In recent years, evidence has been mounting that points to a crisis in the reproducibility of scientific research. Reviews of papers in the fields of psychology and cancer biology found that only 40% and 10% of the results, respectively, could be reproduced.

Nature published the results of a survey of researchers in 2016 that reported:

  • 52% of researchers think there is a significant reproducibility crisis
  • 70% of scientists have tried but failed to reproduce another scientist’s experiments

In 2013, a team of researchers published a paper describing ten rules for reproducible computational research. These rules, if followed, should lead to more replicable results.

All data science is research. Just because it’s not published in an academic paper doesn’t alter the fact that we are attempting to draw insights from a jumbled mass of data. Hence, the ten rules in the paper should be of interest to any data scientist doing internal analyses.

10 Rules for Creating Reproducible Results in Data Science

Rule #1—For every result, keep track of how it was produced

It’s important to know the provenance of your results. Knowing how you went from the raw data to the conclusion allows you to:

  • defend the results
  • update the results if errors are found
  • reproduce the results when data is updated
  • submit your results for audit

If you use a programming language (R, Python, Julia, F#, etc) to script your analyses then the path taken should be clear—as long as you avoid any manual steps. Using “point and click” tools (such as Excel) makes it harder to track your steps as you’d need to describe a set of manual activities—which are difficult to both document and re-enact.
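
As a minimal Python sketch of what "fully scripted" means, every step from raw file to result below is a traceable function call; the file names are illustrative:

```python
# The script itself becomes the record of how the result was produced.
import pandas as pd

def load(path):
    return pd.read_csv(path)

def clean(df):
    return df.dropna(subset=["value"])      # documented, repeatable step

def summarize(df):
    return df.groupby("group")["value"].mean()

if __name__ == "__main__":
    result = summarize(clean(load("raw_data.csv")))   # full provenance chain
    result.to_csv("result.csv")
```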

Rule #2—Avoid manual data manipulation steps

There may be a temptation to open data files in an editor and manually clean up a couple of formatting errors or remove an outlier. Also, modern operating systems make it easy to cut and paste between applications. However, the temptation to short-cut your scripting should be resisted. Manual data manipulation is hidden manipulation.

Rule #3—Archive the exact versions of all external programs used

Ideally, you would set up a virtual machine with all the software used to run your scripts. This allows you to snapshot your analysis ecosystem—making replication of your results trivial.

However, this is not always realistic. For example, if you are using a cloud service, or running your analyses on a big data cluster, it can be hard to circumscribe your entire environment for archiving. Also, the use of commercial tools might make it difficult to share such an environment with others.

At the very least you need to document the edition and version of all the software used—including the operating system. Minor changes to software can impact results.
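
A small Python sketch of recording the environment alongside your results (saving the output of `pip freeze` achieves a similar end):

```python
# Write the OS, interpreter, and key library versions next to the results.
import platform
import sys

import numpy
import pandas

with open("environment.txt", "w") as f:
    f.write(f"os: {platform.platform()}\n")
    f.write(f"python: {sys.version}\n")
    f.write(f"numpy: {numpy.__version__}\n")
    f.write(f"pandas: {pandas.__version__}\n")
```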

Rule #4—Version control all custom scripts

A version control system, such as Git, should be used to track versions of your scripts. You should tag (snapshot) your scripts and reference that tag in any results you produce. If you then decide to change your scripts later, as you surely will, it will be possible to go back in time and obtain the exact scripts that were used to produce a given result.

Rule #5—Record all intermediate results, when possible in standardized formats

If you’ve adhered to Rule #1 it should be possible to recreate any results from the raw data. However, while this might be theoretically possible, it may be practically limiting. Problems may include:

  • lack of resources to run results from scratch (e.g. if considerable cluster computing resources were used)
  • lack of licenses for some of the tools, if commercial tools were used
  • insufficient technical ability to use some of the tools

In these cases, it can be useful to start from a derived data set that is a few steps downstream from the raw data. Keeping these intermediate datasets (in CSV format, for example) provides more options to build on the analysis and can make it easier to identify where a problematic result went wrong—as there’s no need to redo everything.

Rule #6—For analyses that include randomness, note underlying random seeds

One thing that data scientists often fail to do is set the seed values for their analysis. This makes it impossible to exactly recreate machine learning studies. Many machine learning algorithms include a stochastic element and, while robust results might be statistically reproducible, there is nothing to compare with the warm glow of matching the exact numbers produced by someone else.

If you are using scripts under source code control, your seed values can be set in the scripts themselves.
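
For example, a Python analysis might pin its seeds at the top of the script; libraries that accept explicit seed parameters (such as scikit-learn's `random_state`) should receive them too:

```python
# Pin every source of randomness the analysis depends on.
import random

import numpy as np

SEED = 42
random.seed(SEED)      # Python's built-in RNG
np.random.seed(SEED)   # NumPy's global RNG (used by many libraries)

# Pass explicit seeds where APIs accept them, e.g.:
#   train_test_split(X, y, random_state=SEED)
#   RandomForestClassifier(random_state=SEED)
print(np.random.rand(3))  # identical output on every run
```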

Rule #7—Always store raw data behind plots

If you use a scripting/programming language your charts will often be automatically generated. However, if you are using a tool like Excel to draw your charts, make sure you save the underlying data. This allows the chart to be reproduced, but also allows a more detailed review of the data behind it.
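
A minimal Python sketch of the habit: write the plotted data to disk in the same step that renders the chart (file names are illustrative):

```python
# Save the data behind the figure alongside the figure itself.
import matplotlib
matplotlib.use("Agg")              # render without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"x": range(10), "y": [v * v for v in range(10)]})

df.to_csv("figure1_data.csv", index=False)   # raw data behind the plot
plt.plot(df["x"], df["y"])
plt.savefig("figure1.png")
```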

Rule #8—Generate hierarchical analysis output, allowing layers of increasing detail to be inspected

As data scientists, our job is to summarize the data in some form. That is what drawing insights from data involves.

However, summarizing is also an easy way to misuse data so it’s important that interested parties can break out the summary into the individual data points. For each summary result, link to the data used to calculate the summary.

Rule #9—Connect textual statements to underlying results

At the end of the day, the results of data analysis are presented as words. And words are imprecise. The link between conclusions and the analysis can sometimes be difficult to pin down. As the report is often the most influential part of a study it’s essential that it can be linked back to the results and, because of Rule #1, all the way back to the raw data.

This can be achieved by adding footnotes to the text that reference files or URLs containing the specific data that led to the observation in the report. If you can’t make this link you probably haven’t documented all the steps sufficiently.

Rule #10—Provide public access to scripts, runs, and results

In commercial settings, it may not be appropriate to provide public access to all the data. However, it makes sense to provide access to others in your organization. Cloud-based source code control systems, such as Bitbucket and GitHub, allow the creation of private repositories that can be accessed by any authorized colleagues.

Many eyes improve the quality of analysis, so the more you can share, the better your analyses are likely to be.

 


The National Data Science Bowl Makes Plankton Research Sexy
https://dataconomy.ru/2015/03/25/the-national-data-science-bowl-make-plankton-research-sexy/
Wed, 25 Mar 2015

Rare is the man whose heart starts racing upon hearing the word “plankton”. It might not be the sexiest field in the world, but the study and categorisation of these marine microorganisms can tell us a huge amount about the health of our oceans. The problem? There’s tonnes of them. The dataset of the Oregon State University Hatfield Marine Science Center is the same size as 400,000 3-minute YouTube videos. Manually categorising all that data would take marine biologists two lifetimes to complete.

Of course, in our age, there’s a solution: machine learning-optimised automation. The Oregon State team handed their dataset over to the National Data Science Bowl, which challenged entrants to automate the ocean health assessment process. The entries to the contest, co-sponsored by Kaggle & Booz Allen Hamilton, were staggering, equivalent to $4 million worth of analytics research.

1,000 teams set to work on algorithms to classify over 100,000 images of plankton. The winning team, Ghent University’s imaginatively-titled “Team Deep Sea”, beat out over 15,000 other submissions with their model, which will save the marine research team years in categorisation time.
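
The winning entry was a deep convolutional network; the PyTorch sketch below shows only the general shape of such an image classifier. The framework, architecture, input size, and class count are chosen for illustration and are not Team Deep Sea's actual stack:

```python
# Toy convolutional classifier for grayscale plankton images.
import torch
import torch.nn as nn

class PlanktonNet(nn.Module):
    def __init__(self, n_classes=121):     # illustrative class count
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):                  # x: (batch, 1, 64, 64)
        h = self.features(x)
        return self.classifier(h.flatten(1))

model = PlanktonNet()
logits = model(torch.randn(4, 1, 64, 64))
print(logits.shape)                        # (4, 121): one score per class
```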

“We were originally drawn to this competition because of the vital social cause it supported,” said Team Deep Sea member Pieter Buteneers. “What we found was a truly life-changing opportunity to collaborate as a team and build something great together, and we are proud to have competed against such a high-caliber field of data scientists.”

“The quality of submissions and types of ideas being discussed by our community were truly amazing,” said Anthony Goldbloom, Kaggle’s founder and CEO. “The winning team used a cutting edge deep learning approach to create their winning model. Currently, even basic machine learning techniques are not widely used in the marine sciences and this competition has done a tremendous amount towards further exposing researchers in the field to its benefits.”

Although deep learning is a hot (if perhaps hype-laden) field right now, many industries, including the wider marine biology community, are yet to see the benefits of these advanced techniques. Moving forward, Hatfield Marine Science Center Director Bob Cowen hopes to change that. “Our hope is that we will be able to expand upon this research and, eventually, make it an open source tool for the marine research community,” he said.

Photo credit: Phil’s 1stPix / Foter / CC BY-NC-SA

Quantum Teleportation Covers More Miles with Latest Milestone at the University of Geneva
https://dataconomy.ru/2014/09/26/quantum-teleportation-covers-more-miles-with-latest-milestone-at-the-university-of-geneva/
Fri, 26 Sep 2014

A European team of physicists working in the lab of Professor Nicolas Gisin in the physics department at the University of Geneva has demonstrated a method that can teleport quantum information to a solid-state quantum memory over telecom fiber.

For physicist Félix Bussières, who has headed the research group through a string of experiments over the last decade aimed at perfecting quantum data transfer using entanglement, this marks a crucial milestone towards the development of a quantum Internet.

What occurred was teleportation of the quantum state of a photon onto a crystal doped with rare-earth ions placed 25 kilometers away, essentially transferring information from light to matter over the kind of ordinary telecom optical fiber in use all over the world. The previous record of 6 kilometers (3.7 miles) was set in 2003 by the same team.

To put things in perspective: quantum teleportation is a process by which quantum data can be transmitted from one location to another, with the help of classical communication and quantum entanglement between the sending and receiving locations, without traveling through the space in between. Matter itself doesn’t make this journey, only the information that describes it, explains the MIT Technology Review.

The researchers were able to generate entangled pairs of photons (call them A and B) with different wavelengths, one of which (photon B) passes easily through telecom optical fiber. They sent photon A to the quantum memory (the doped crystal), where it was stored, while transmitting photon B through a fiber to another apparatus that prepared a third photon (X, at the same wavelength as B) carrying the polarization state to be teleported.

This is when the teleportation takes place. When photons B and X interact in a certain way, through a joint measurement of the pair, the polarization is teleported to the quantum memory at the other end of the experiment, onto photon A.
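
In textbook terms, that joint measurement is a Bell-state measurement on photons X and B. The standard teleportation identity (general quantum mechanics, not this experiment's specific formalism) shows why it works: writing the unknown polarization state of X together with the entangled pair in the Bell basis gives

```latex
% Standard teleportation identity; X carries the unknown state |psi>,
% BA is the entangled pair shared between the fiber and the memory.
\begin{align*}
|\psi\rangle_X \otimes |\Phi^+\rangle_{BA}
  = \tfrac{1}{2}\Big(
      &|\Phi^+\rangle_{XB}\,|\psi\rangle_A
     + |\Phi^-\rangle_{XB}\,\sigma_z|\psi\rangle_A \\
     +\,&|\Psi^+\rangle_{XB}\,\sigma_x|\psi\rangle_A
     + |\Psi^-\rangle_{XB}\,\sigma_x\sigma_z|\psi\rangle_A
    \Big)
\end{align*}
```

Whichever of the four Bell states the measurement finds, photon A is left holding the input state up to a known Pauli correction, which is why the classical measurement outcome must also be communicated to complete the teleportation.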

“The team’s measurements on these photons show that the polarization state is indeed teleported as quantum mechanics suggests. A crucial part of the experiment is a new generation of single photon detectors that can spot telecoms photons with much greater efficiency than has been possible before,” reports the MIT Technology Review.

This marks a small step towards a quantum internet and the kind of machinery that will need to be developed to make it happen.
The findings of the research were published in the journal Nature Photonics on September 21.



(Image credit: University of Geneva)
