Archive for the ‘Basic Sciences’ Category

The Science of Networks

Monday, March 30th, 2015

Networks are everywhere. From the Internet to economic networks and networks of disease transmission, the imagery of the network pervades our modern culture. What exactly do we mean by a network? What different kinds of networks are there? And how does their presence affect the way events happen? In the past few years, a diverse group of scientists and researchers, including mathematicians, physicists, computer scientists, sociologists, and biologists, have been actively pursuing these questions and, in the process, building the new research field of network theory, or the “science of networks”.

The study of networks has a long history in mathematics and the natural sciences. In 1736, the great mathematician Leonhard Euler became interested in a mathematical riddle called the Königsberg Bridge Problem. The city of Königsberg was built on the banks of the Pregel River in what was then Prussia, and on two islands that lie in midstream. A popular brain-teaser of the time asked, “Does there exist any single path that crosses all seven bridges exactly once each?” Legend has it that the people of Königsberg spent many fruitless hours trying to find such a path before Euler proved that no such path exists. In solving the puzzle, Euler laid the foundations of network theory, and for most of the following two hundred years it remained a branch of abstract mathematics. A network is made up of nodes and links, and mathematicians long assumed that the links between nodes were randomly distributed: with, say, 10 nodes and 50 links, every node would end up with roughly the same number of links (an average of ten each, since every link touches two nodes). For years, mathematicians explored the properties of these random networks.

Nowadays, we see the Internet as a source of networks: of people, of groups, of hashtags on Twitter, of social clusters on Facebook, and so on. The Internet was originally designed for the American military as a distributed system with no central point, in order to create a communications network that could survive an attack. In the 1990s, physicists began studying the Internet because it was an example of a network in which all the nodes and links could be tracked. Computer scientists soon realized that the Web was not randomly distributed: maps of the Web showed that a few nodes had huge numbers of links, while most nodes had only a few.

In biomedicine, the impact of networks has only recently been tackled. In my 2013 article on the cover of the journal Drug Discovery Today, I wrote that “Social networks can be seen as a nonlinear superposition of a multitude of complex connections between people where the nodes represent individuals and the links between them capture a variety of different social interactions.” In addition, “…the emergence of different types of social networks has fostered connections between individuals, thus facilitating data exchange in a variety of fields.” (see my review article “Social networks, web-based tools and diseases: implications for biomedical research”). Networks of people, and how to make sense of them, are the hot topic today. A social network is a social structure made up of individuals (or organizations) called “nodes”, which are tied (connected) by one or more specific types of interdependency, such as friendship, kinship, common interest, financial exchange, dislike, sexual relationships, or relationships of beliefs, knowledge or prestige (for more information check “Social Network Analysis – Theory and Applications”).
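To make the nodes-and-links picture concrete, here is a minimal sketch in Python. The names and ties are invented purely for illustration; in a real study the links would come from observed interactions (friendship, kinship, financial exchange, and so on):

```python
from collections import defaultdict

# Each tie is an undirected link between two individuals (the "nodes").
ties = [
    ("Ana", "Bruno"), ("Ana", "Carla"), ("Ana", "Daniel"),
    ("Bruno", "Carla"), ("Carla", "Elisa"), ("Daniel", "Elisa"),
]

# Adjacency list: node -> set of directly connected nodes.
network = defaultdict(set)
for a, b in ties:
    network[a].add(b)
    network[b].add(a)

# The simplest measure of influence is degree: how many links a node has.
for person, contacts in sorted(network.items(), key=lambda kv: -len(kv[1])):
    print(f"{person}: {len(contacts)} links -> {sorted(contacts)}")
```

Real analyses go much further (clustering, communities, paths between nodes), but even raw degree already identifies the best-connected individuals in the structure.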
One can identify a person and the connections that person has, how influential he or she is, and what social interactions he or she takes part in. The science of networks has also been applied to business: companies that embed themselves in the social network of an industry by creating many contacts (links to other nodes of the network) with other companies, suppliers, industry magazines, customers, government, and workers tend to grow, because the node with the most links tends to attract even more links (a toy simulation of this “rich get richer” dynamic appears at the end of this post).

In life sciences, the science of networks transforms data collection into actionable information that can improve individual and population health, deliver effective therapies and, consequently, reduce the cost of healthcare. These novel tools might also have a direct impact on personalized medicine programs, since the adoption of new products by healthcare professionals, as well as peer-to-peer learning, could be improved using social networks (see more in my article “The Impact of Online Networks and Big Data in Life Sciences”). Thus, the science of networks could also help the industry gain insights into how people use, react to and benefit from pharmaceuticals and medical devices. Such accumulated information could be fed into the product development process as a “lean” way to test new products. The impact of the science of networks and social networking on the scientific and biomedical communities is immeasurable. This area of research is only beginning, and I believe there is a lot more to come. Welcome to the Networking Era!
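As promised above, here is a toy simulation of the “rich get richer” growth rule, known as preferential attachment. It is a minimal sketch with arbitrary numbers, not a faithful model of any real network: each newcomer links to an existing node with probability proportional to that node’s current number of links.

```python
import random
from collections import Counter

random.seed(42)                 # reproducible toy run
degree = Counter({0: 1, 1: 1})  # start with two nodes joined by one link
targets = [0, 1]                # one entry per link end, so choosing
                                # uniformly from it is degree-weighted

for new_node in range(2, 1000):
    partner = random.choice(targets)   # well-connected nodes get picked more often
    degree[new_node] += 1
    degree[partner] += 1
    targets += [new_node, partner]

print("Most-linked nodes:", degree.most_common(5))   # a handful of big hubs
degrees = sorted(degree.values())
print("Median degree:", degrees[len(degrees) // 2])  # most nodes stay small
```

Run it and a few early nodes accumulate dozens of links while the median node keeps one or two: the same few-hubs-many-small-nodes shape that the maps of the Web revealed.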

Where is Science going?

Tuesday, January 29th, 2013

Science is changing fast; it is no longer purely hypothesis-driven the way it used to be. Every kind of research now faces increasing amounts of information and data to deal with. Fields such as astronomy, genomics, physics, drug discovery in biomedicine, and several others have been using Information Technology (IT) to analyze huge volumes of data and make sense of them. We have reached a point where hypotheses are generated after you get the data from experiments; then high-throughput computing and mathematical algorithms are used to answer the questions. In other words, instead of generating data based on a specific hypothesis, you generate huge amounts of data and only then ask the questions and formulate a hypothesis – it is backwards! (A toy sketch of this data-first workflow appears at the end of this post.) This can be explained by the growing overlap between Information Technology and virtually every kind of research.

We have always used computers for specific tasks, especially in research. But now computers are fast, the Internet is even faster, and we are creating an enormous gap: science and young scientists (generally speaking) are not prepared for this information overload named “Big Data”. One example is genomics, mainly because DNA sequencing machines are evolving at a pace that is leaving Moore’s Law in the dust (for more information see the article “Big Data in Genomics: challenges and solutions”). We have generated more data in the last few years than in all of our prior existence, and the Big Data revolution is definitely impacting scientific and biomedical research. A specific example is the ENCODE project, which is trying to map all functional regions in a person’s DNA (check the article “ENCODE: Big Data to deal with human complexity” for more information).

As I mentioned before, science faces an increasing deficit of people able not only to handle big data but, more importantly, equipped with the knowledge and skills to generate value from it. How to aggregate and filter data, how to present it, how to analyze it and gain insights, how to use those insights to aid decision-making, and how to integrate all the information: these questions matter for the future of science and scientific progress. The problem is that researchers need a toolbox of techniques, skills, processes and abilities to construct new solutions on top of this accumulation of information. They need the ability to create user interfaces that turn abstract findings into something others can understand, and the skills to transform raw data into information in elegant ways and then investigate it.

A Wired article put it nicely: we are forgetting scientific theory and philosophy because of Big Data. We are giving a lot of credit to computational power and forgetting the main scientific ingredients, human curiosity and instinct – and computers have neither. We have reached a point where supercomputers can crunch almost any dataset with ease. This could be good or bad, depending on how we use this power. Time will tell, but for now let me go back to my “hypothesis generator”, or should I say my computer… Scientists have to work! (Image Credits: Nature Magazine)
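P.S. For the curious, here is the promised toy sketch of the data-first workflow: fabricate a pile of measurements with no theory in mind, screen them for patterns, and only then formulate hypotheses. The dataset and variable indices are synthetic, invented purely for illustration:

```python
import random

random.seed(7)
n_samples, n_variables = 200, 50

# Fabricate a dataset: 50 variables, one of which (index 3) is
# genuinely related to the outcome; the rest are pure noise.
data = [[random.gauss(0, 1) for _ in range(n_variables)] for _ in range(n_samples)]
outcome = [row[3] * 0.8 + random.gauss(0, 1) for row in data]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external libraries."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Screen every variable against the outcome and rank by |correlation|.
scores = [(abs(pearson([row[v] for row in data], outcome)), v)
          for v in range(n_variables)]
for score, v in sorted(scores, reverse=True)[:3]:
    print(f"variable {v}: |r| = {score:.2f}  <- candidate hypothesis")
```

With 50 noise variables, some spurious correlations will rank high as well, which is exactly why the screen only generates candidates: human curiosity and instinct still have to vet them.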

ENCODE: Our Guide to the Human Genome

Wednesday, September 26th, 2012

Early in 2001, the Human Genome Project gave us a first readout of our DNA. The elucidation of the human DNA sequence was important because it handed us the instructions to make a human being; however, we are only starting to realize how incomplete those instructions are. Researchers were able to uncover 3 billion letters of the DNA molecule, but only roughly 2% of them (around 25,000 protein-coding genes) corresponded to the building blocks (proteins) of our cells. Based on that, many biologists suspected that the information responsible for the complexity of human cells could be somewhere in the “deserts” between the protein-coding genes.

ENCODE, which stands for Encyclopedia of DNA Elements (for more information see the article “The Human Encyclopaedia” in Nature), is a project that started in 2003 as a massive data-collection effort uniting several laboratories all over the world. The main objective was to understand the deserts between protein-coding genes, catalogue the “functional” parts of the DNA sequence and understand their regulation. In summary, the objective of the ENCODE Project was to show whether the rest of the genome, specifically the non-coding areas, was doing something important inside our cells.

An interesting fact about ENCODE is that it was a consortium created between groups that are usually competitors. These groups generated an incredible amount of information that had to be collected, stored and analyzed. Indeed, science today is increasingly “social”, especially in fields such as genomics, in which huge amounts of data are generated. In such projects collaboration between groups is key, and this project was only possible because of it. It was also good training for researchers in the big scientific projects that will become more common from now on, in which tons of data are generated, stored, transferred and analyzed.

After almost 10 years of intensive data analysis, the researchers involved in the project published their results in 30 papers across three different journals. According to ENCODE’s main conclusions, more than 80 percent of our genome has a “biochemical function”. These regions were classified as “junk” for a long time, but ENCODE is showing the opposite. Tom Gingeras, one of the study’s main scientists, declared that “Almost every single nucleotide is associated with a function of some sort or another in the genome” (see more in DISCOVER Magazine), reinforcing the idea of functionality for most of the genome. And what about the remaining 20 percent of human DNA? Researchers believe those regions are not “junk” either: ENCODE looked at hundreds of cell types, but the human body has thousands. A given part of the genome might control a functional element in one cell type but not in others, so the complexity of the information could be even higher.

The researchers also claim that ENCODE has one important implication: it redefines what a “gene” is. The study has changed our view of the genome, since the functional elements overlap extensively; and given how complex an organism we are, it is not surprising that our genome turned out to be just as complex. The new definition suggests that a “gene” is a collection of transcripts, united by a common factor, whose function could lie either in the genome itself or in biochemical reactions within a cell. Human genome research is far from finished, and this could go on for decades (if not forever…).
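To put these proportions in perspective, here is a quick back-of-the-envelope calculation. The genome size, the roughly 2% coding fraction and ENCODE’s “more than 80 percent” figure come from the discussion above; everything else is simple arithmetic:

```python
genome_bp = 3_000_000_000          # ~3 billion letters of DNA
coding_fraction = 0.02             # protein-coding portion (~25,000 genes)
encode_functional_fraction = 0.80  # "biochemical function" per ENCODE

coding_bp = genome_bp * coding_fraction
functional_bp = genome_bp * encode_functional_fraction

print(f"Protein-coding DNA: ~{coding_bp / 1e6:.0f} million letters")
print(f"'Functional' DNA per ENCODE: ~{functional_bp / 1e9:.1f} billion letters")
print(f"Non-coding yet functional: ~{(functional_bp - coding_bp) / 1e9:.2f} billion letters")
```

In other words, the protein-coding “2%” is only about 60 million letters, while the functional territory ENCODE points to is forty times larger.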
For those who thought that elucidating the sequence of human DNA would be enough to understand a human being, a big lesson has been learned. The complexity of the information coming out of ENCODE will probably need another decade to be fully understood. In fact, ENCODE is just the start of a long journey inside the DNA of our cells. We are just beginning to build a guide to our genome (Image Source: Nature Journal).

Microbiome – the extension of our genome?

Wednesday, May 30th, 2012

After the sequencing of the human genome in 2001 and the updated releases that came after it, researchers thought this breakthrough would be enough to understand humans. It was indeed the first step towards a better understanding, and the scientific community believed we would be able to understand and cure complex diseases in less than a decade. A decade has gone by and we still have not. There are two major barriers behind this underachievement: 1) we still have several open questions about the human genome sequence, and there are regions we do not completely understand to this day; and 2) there are other layers of information contained in the human organism.

These new layers come from microorganisms living in our nasal passages, oral cavity, skin, gastrointestinal tract, and urogenital tract in a kind of “symbiosis” with our organism. These are the good microbes that help us digest the food we eat and help our immune system fight disease, among other functions we are just uncovering. Together they are called the microbiota, and the totality of their genomes in humans is the microbiome. More precisely, a microbiome is the totality of the microbes, their genetic elements (genomes), and their environmental interactions in a particular environment (in this case, inside our bodies). The term “microbiome” was coined by Joshua Lederberg, who argued that the microorganisms inhabiting the human body should be included as part of the human genome because of their influence on human physiology. The human body contains about 100 trillion microbial cells, over 10 times more microbial cells than human cells, which amazes me.

It is already clear that the microbiota significantly affect human physiology. For example, in healthy individuals the microbiota provide a wide range of metabolic functions that humans lack (for more details see “Microbial Ecology of the Gastrointestinal Tract”). In diseased individuals, altered microbiota are associated with conditions such as neonatal necrotizing enterocolitis, inflammatory bowel disease (IBD), and other autoimmune disorders. Thus, studying the human microbiome is an important task that has been undertaken by initiatives such as the Human Microbiome Project and MetaHIT (for details see “A human gut microbial gene catalogue established by metagenomic sequencing”).

The Human Microbiome Project (HMP) is a United States National Institutes of Health (NIH) initiative with the goal of identifying and characterizing the microorganisms found in association with both healthy and diseased humans (their microbial flora). Launched in 2008, it is a five-year project, best characterized as a feasibility study, with a total budget of $115 million. The ultimate goal of this and similar NIH-sponsored microbiome projects is to test whether changes in the human microbiome are associated with human health or disease. It is a very important project and could be considered a second “Human Genome Project”, though several orders of magnitude more complex. In the last century, traditional microbiology focused on the study of individual species as isolated units; however, the vast majority of microbial species have never been successfully isolated as viable specimens for analysis, presumably because their growth depends on a specific microenvironment that has not been, or cannot be, reproduced experimentally.
Importantly, this project collects samples directly from people, with no culturing in the laboratory. The HMP has already shown that, interestingly, the human microbiome is different in people who are obese or have autoimmune diseases such as IBD when compared to healthy controls (for more information see “Metagenomic systems biology of the human gut microbiome reveals topological shifts associated with obesity and inflammatory bowel disease”). Another study has shown that microbiomes can be continent- and nation-specific (see the article “Enterotypes of the human gut microbiome”). This indicates that the microbiota of these groups differ, and those differences could be important for our understanding of pathological states and for the development of new drugs against disease (a toy sketch of this kind of comparison appears below). So I guess we could say that our microbiomes, which represent the genomes of all the microorganisms in our body, are indeed an extension of our own genomes. This means we have more complexity to deal with, and deeper science will be needed to fully understand humans in the years to come! (Image Credits: Dribbble)
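P.S. Here is a minimal sketch of the kind of cohort comparison mentioned above, assuming we already have read counts per bacterial group from metagenomic sequencing. The taxa and counts are invented for illustration, not real HMP data:

```python
def relative_abundance(counts):
    """Convert raw read counts into fractions of the whole community."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical read counts per bacterial phylum for two cohorts.
healthy = {"Bacteroidetes": 520, "Firmicutes": 390, "Proteobacteria": 60, "Actinobacteria": 30}
disease = {"Bacteroidetes": 280, "Firmicutes": 560, "Proteobacteria": 130, "Actinobacteria": 30}

h, d = relative_abundance(healthy), relative_abundance(disease)
for taxon in healthy:
    shift = d[taxon] - h[taxon]
    print(f"{taxon:15s} healthy {h[taxon]:5.1%}  disease {d[taxon]:5.1%}  shift {shift:+.1%}")
```

Real metagenomic pipelines add sequencing, taxonomic classification and statistics on top, but the core question is the same one the HMP asks: which groups of microbes shift between healthy and diseased communities.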