Interfacing Scientific Research Products

Materials research science has several concurrent goals, each of varying importance to the agents in a collaboration. In research science, we produce Software, Data, Published Materials, and Teaching Tools. All of these components are critical to research science, and asserting their relative importance is an ill-posed question.

Software

Software is the code and raw language that transforms information into knowledge. Software includes OEM software, open source software, enterprise software, homegrown software, etc. A well-documented software process is critical to reproducible research. Sometimes the research process yields reusable code for internal group use or for the community at large.

Data

Data is the general currency of materials research. Materials scientists work directly on the physical world, or on abstractions of it, to understand why the materials around us behave as they do. Materials data can come from anywhere and in every possible form. Materials data may be the main product of a thrust, or simply an input to a data science problem.

Publication

Research science typically culminates in knowledge transferred through peer-reviewed publication. The research process itself is driven by knowledge published in alternative sources like email, presentations, and internal documents. Publication is a means to filter Big Data research into its apparent salient knowledge. Knowledge is a distillation of information that carries a lower transaction cost. Big Data problems can rarely be condensed into a single publication, so the community is left with a coarse description of the underlying information. The information presented is biased by the intentions of the authors rather than the needs of the community. A lot of content is produced during the research process that could serve as critical metadata, but that information is undiscoverable.

Disparities in Publication Models

Materials science relies heavily on peer-reviewed publication. Journal articles and their impact factors are used to quantify the importance of the work. Meanwhile, in Computer Science and Electrical Engineering, conference papers seem to make up the bulk of research publication. Peer-reviewed conferences are more frequent, and their articles are shorter. Research papers in the scope of a conference are much smaller, more focused, and less tedious to review. Peer-reviewed journal articles are still important in these communities, but they tend to provide an overarching connection between the conference papers.

Teaching Tools

Materials Genome Initiative

There are a few critical drivers in the Materials Genome Initiative; I will focus on YOU: The Next Generation Workforce. The world of digitally integrated materials science is in its nascent stages. There is a lot of uncertainty about the overall direction of the Initiative. I believe that the generation arriving on the doorstep of the MGI will be an enormous driver of its maturation. I hope to use this course, your experiences, and your patience to develop new learning tools for the emerging MGI community.

A Balancing Act

Important By-products of Research Science

The goals of agents in a research collaboration are not created equal. Research has many consequences, and each contributor has a role and a talent. Elements of the research science spectrum have value external to the collaboration itself. The science should contribute to the public good. The management of Big Data and modern research science could stand a more democratic process for disseminating research products.

As researchers, we are rarely aware of the full scope of our work's applicability.

Critical Time to Publication

Time is not on your side

Lags in communication and information access have a dramatic impact on both you and your collaborators. The pace of developing products and deploying scientific research can be drastically increased simply by streamlining communication with your collaborators. Streamlined communication creates less waste on your end and reduces the energy your collaborators expend in responding to or collaborating with you.

Rapid product and research deployment requires moving information and communications along asynchronously. At the same time, it behooves researchers to centralize their research end-products by optimizing their innovation stack. Centralizing Software, Data, Publications, and Teaching Tools will minimize many non-technical workflow redundancies.

Research science must begin to embrace better means of asynchronous communication that are inspired by the complex relationships between the research end-products. A collaborative research stack will efficiently manage the discussions and comments that provide critical metadata to software, data, publications, and teaching tools.

What are some ideas that you could use to streamline information access? How can we minimize the work in transferring information?

Workflow Redundancies and Pipelines

Delivering information to your collaborators promptly is critical to a fluid collaboration. Oftentimes trivial tasks seem Herculean because the wrong software is being used. Improving the speed of information access reduces the overall non-technical work in a collaboration because it directly reduces the effort that each agent in the collaboration needs to expend.

In "So you want to be a computational biologist?" the authors suggest that "Pipelineitis is a nasty disease." Further, they offer this warning to researchers:

Warning: don’t pipeline too early. Get a method working before you turn it into a pipeline. And even then, does it need to be a pipeline? Have you saved time? Is your pipeline really of use to others? If those steps are only ever going to be run by you, then a simple script will suffice and any attempts at pipelining will simply waste time. Similarly, if those steps will only ever be run once, just run them once, document the fact you did so and move on.

Sometimes Research Falls off the Rails

Montparnasse derailment

More often than not, researchers must wrestle with their misconceptions about the direction of their research rather than sip champagne in admiration. Research failures or poor hypotheses can be circumvented with a willingness and openness to publish using modern web tools. A blog-aware approach to research science, if you will.

With web publishing, peer review can be organic: ask colleagues and the community to assess intermediate progress. Injecting colleagues' opinions into the research process could reduce unneeded effort and other waste.

How does rapid publishing impact time?

Science as a Startup

Why are startups failing so badly everywhere we look? The first problem is the allure of a good plan, a solid strategy, and thorough market research. In earlier eras, these things were indicators of likely success. The overwhelming temptation is to apply them to startups too, but this doesn’t work, because startups operate with too much uncertainty. Startups do not yet know who their customer is or what their product should be. As the world becomes more uncertain, it gets harder and harder to predict the future. The old management methods are not up to the task. Planning and forecasting are only accurate when based on a long, stable operating history and a relatively static environment. Startups have neither.

Excerpt from The Lean Startup by Eric Ries

Science is a Startup; Science progresses under extreme uncertainty; Science is innovation. Successful innovation requires ideas to be agile and adaptable. Modern web publishing with embedded comments can strongly impact innovation in research science by providing a more rapid means of sharing information than peer-reviewed publication.

Wrap Up

The NEW materials science is more uncertain than ever before; the current pool of knowledge is disparate and unassimilated. Moving forward, Materials Informatics and Materials Genome efforts must be directed at innovation and learning. Non-materials scientists are going to be your direct colleagues. A (semi)open dialogue will allow researchers to navigate the undefined products of research science before getting caught in time warps.