The exponential growth in research information can drive discovery and scientific advancement in the aggregate, but it can also hurt the efficiency of your researchers and your organization as they search for the proverbial needle in what is now a skyscraper of hay.
The volume of scientific data is now doubling every two years. In fact, it's predicted that the entire digital universe will reach 44 trillion gigabytes by 2020. That's a good thing, right? Not always. Researchers know that searching scientific information is different from the quick trips to Google or Baidu we all make for everyday needs. The searcher must evaluate each document and judge whether it is relevant or worthy of further study. The entire research process, and any downstream intellectual property claims, will rely on this early investigation.
For every lab project, it's not uncommon for researchers to search for, evaluate, read and internalize hundreds if not thousands of papers, patents, reviews, conference proceedings, etc. As the data universe grows, this time commitment has become a more significant resource investment than you may realize. Leaders must have a plan to account for this trend.
The challenges of having too much information
Risk: Decreased research efficiency
CAS regularly conducts research on scientists' information search activity and behavior. This work spans scientific disciplines and involves hundreds of thousands of researchers worldwide. The results are alarming! On average, researchers spend seven hours a week searching online for information, nearly one full working day every week. If you think your researchers are on the internet every time you walk by their desks, it's probably because they are. Nearly four of those hours are spent simply scanning results. Unfortunately, CAS' research shows this challenge could get much worse if you don't have a strategy and the resources to deal with it.
Compounding this issue is the growing complexity of the information itself. As you can see in the figure below, the average number of unique concepts (a measure of document richness or complexity) CAS scientists have identified per patent has almost tripled over the past 10 years.

In short, more documents to review, more research to evaluate and more time spent comprehending—all at the cost of time spent on the innovation itself.
This is the central challenge that now drives CAS innovation and has led to solutions such as SciFinderⁿ, which leverages the most advanced chemically intelligent relevance engine available. Cheminformatics solutions must do more than retrieve hits, count references and offer intuitive interfaces; they must help.
For this reason, CAS is also actively working in areas such as predictive synthesis, big data analytics, machine learning and data modelling. This is in addition to offering the highest-quality and most comprehensive data source in the industry for the past century.
Risk: Increased decision-making impact
In science, the past forms a foundation for today's breakthroughs. Thus, it is crucial that the information identified and used is reliable; otherwise, time can be wasted and reputations damaged. Filtering out the noise has never been more important to your organization's research efficiency. A number of factors, beyond the volume of information being generated, make effective information searching an increasing challenge.
First is the proliferation of scientific journals and other global information sources. With many new publications emerging, it is difficult to know which have the highest-quality information and which should be used with caution. Some publications go through rigorous peer review or application processes, while others receive little oversight.
A second factor is economic. CAS has noted organizations or economies that incentivize scientists to publish. One example is China's Patent Promotion Policies, under which the government has been financially rewarding scientists who are granted patents. This policy has dramatically increased the number of approved patents, and China is now the number one country in terms of the quantity of patent applications filed. Incentives drive behavior and can drive important discovery, but they can also produce rushed work. Increases in new patents have been offset by an increase in patent withdrawals and a decrease in renewals.
This issue also dramatically affects your organization's ability to accurately identify areas of open white space in which you can undertake novel research. Inaccurate or faulty IP claims can create the appearance of a crowded landscape, leading research teams to quickly abandon an area that, upon further inspection, still offers viable opportunities.
At CAS, we have always relied on a rigorous process for selecting sources to include in our content collection. Our team of expert scientists, speaking more than 50 languages, reviews new publications and data sources to assess quality and relevance, taking that burden off users of our solutions. Furthermore, we continue to leverage human expertise to carefully curate and enrich our content collection by identifying and connecting key ideas, substances, reactions and properties. When errors are uncovered, we can often contact the authors or publishers directly. In this way, we use our unique position to ensure you have a source of rigorously vetted, high-quality data that you can rely on with confidence.
CAS, your partner in tackling the scientific information haystack
As the exponential growth of published knowledge continues to accelerate, your team's ability to find the needles in scientific haystacks is a critical competitive advantage. This is the fundamental challenge that drives CAS—where science and strategy converge.
Learn more about our scientific information solutions and how CAS can help you stay on top of the information haystack.