Poster on our recent research on symmetries and symmetry breaking in biological networks

Isaac Newton was a successful investor, but he lost a fortune (£15m in today’s money) in the South Sea Bubble. When asked about his misadventure, he supposedly replied that he ‘could calculate the motions of the heavenly stars, but not the madness of people’ (presumably, himself included).

If you’re looking to learn Python, with applications to neural data analysis, check out the link.

A Spectacular 1986 Goal by Maradona Still Inspires Debate on Bank of England Policy

On the $100 million U.S. project to determine the DNA changes that drive nine forms of cancer: It is “not likely to produce the truly breakthrough drugs that we now so desperately need,” Watson argued. On the idea that antioxidants such as those in colorful berries fight cancer: “The time has come to seriously ask whether antioxidant use much more likely causes than prevents cancer.”

Can Hermaphrodites Teach Us What It Means To Be Male?, an article written by The Evolution Institute on Caenorhabditis elegans, related to the work that we are doing in our lab.

Below is a list of articles that discuss how biologists view equations in papers:

- T. Fawcett & A. Higginson. Heavy use of equations impedes communication among biologists, PNAS (2012).
- A. Fernandez. No evidence that equations cause impeded communication among biologists, PNAS (2012).
- N. Chitnis & T. Smith. Mathematical illiteracy impedes progress in biology, PNAS (2012).
- J. Gibbons. Do not throw equations out with the theory bathwater, PNAS (2012).
- A. Kane. A suggestion on improving mathematically heavy papers, PNAS (2012).

Visionary entrepreneurs generally “have mental health profiles that are associated with higher levels of creativity, higher levels of energy, higher levels of risk tolerance and higher levels of impulsivity. Another way to look at impulsivity is a need for speed, a sense of urgency, higher motivation, and greater restlessness.”

Elon Musk

Our Lab advances the objectives of five of the 10 Big Ideas for Future NSF Investments:

- “Harnessing Data for 21st Century Science and Engineering”: Big-Data Science
- “Understanding the Rules of Life: Predicting Phenotype”: From Structure to Function
- “Work at the Human-Technology Frontier: Shaping the Future”: Artificial Intelligence
- “Growing Convergent Research at NSF”: Interdisciplinary Research
- “NSF INCLUDES: Enhancing Science and Engineering through Diversity”: Diversity

This document describes all of the ten big ideas that will push forward the frontiers of research across all NSF-funded fields.

Please find here a list of important papers compiled by the graduate students in this lab. These papers range from introductory to technical, general to project-specific, so you should be able to get a good idea of the types of research we are conducting.

Thinking of learning R or Python? Please see the following compilation of resources.

**The following is a list of software used in our lab to analyze networks and data. Students and postdocs should be (or become) familiar with these tools.**

**Low Expertise Required:**

MATLAB — generally used for data cleaning, data analysis, calculations, plot generation, etc.; you can get as simple or complicated as you want with it

Complex Networks toolbox — MATLAB toolbox by Lev Muchnik for analysis of complex networks; includes a k-shell decomposition algorithm

Machine learning toolbox — MATLAB toolbox for (basic) machine learning

Community detection/modularity algorithm — MATLAB code used to find community structure and modularity of a network

Network attributes algorithm — MATLAB code used to find network components, sizes, and lists of member nodes

Python — another general-use platform; again, uses can range in complexity

NetworkX — Python library used to find basic attributes of a network, such as the degree distribution

graph-tool — Python library for fast component decomposition, finding modularity, large network visualization

pandas — Python library used for data management

NumPy — Python library used for vector and matrix operations

SciPy — Python library for statistics, hypothesis testing, regression, and numerical computation

Beautiful Soup — Python library used for website scraping

Scikit-learn — Python library used for basic machine learning methods, including GLasso and stochastic gradient descent

ImageJ — Java image processing program used for optical CT imaging analysis

Gephi — visualization and analysis software for networks ***CAN BE BUGGY — SAVE WORK OFTEN***

Pajek — general network visualization software

**Low/medium Expertise Required:**

SQLite — used for Twitter data management and analysis

**Medium Expertise Required:**

Graphical Lasso (GLasso) algorithm — MATLAB code used to find a sparse inverse correlation matrix

Collective Influence algorithm — C code implementation of Collective Influence algorithm; can be downloaded on the Software page

Monte Carlo for Maximum Entropy XY model — C code to find interaction matrix for network which can be modelled via a Maximum Entropy XY model ***BEST FOR VERY SMALL NETWORKS***

FMRIB Software Library (FSL) — used for model-based fMRI analysis (FEAT) and brain extraction (BET)

BrainNet Viewer — brain network visualization software

**Medium/high Expertise Required:**

Medical imaging toolbox — MATLAB toolbox specifically for medical imaging

Natural Language Toolkit — platform for building Python code used in natural language processing (e.g., on Twitter)

**High Expertise Required:**

TensorFlow — used for Deep Learning development in machine learning
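To give a concrete sense of two quantities that the network tools listed above compute, here is a minimal sketch of a degree distribution and a k-shell decomposition in plain Python. It is an illustration only, not the lab's code or the implementation used by any of the listed toolboxes; the function names and the toy graph are hypothetical. In practice one would typically call a library routine such as `networkx.core_number` instead.

```python
from collections import Counter


def degree_distribution(adj):
    """Return {degree: number of nodes with that degree}."""
    return dict(Counter(len(neighbors) for neighbors in adj.values()))


def k_shell(adj):
    """Return {node: shell index} by iteratively pruning low-degree nodes.

    `adj` maps each node to the set of its neighbors (undirected graph).
    """
    # Work on a copy so the caller's graph is left untouched.
    remaining = {node: set(nbrs) for node, nbrs in adj.items()}
    shell = {}
    k = 0
    while remaining:
        # Nodes whose current degree is at most k belong to shell k.
        peel = [n for n, nbrs in remaining.items() if len(nbrs) <= k]
        if not peel:
            k += 1  # nothing left to prune at this level
            continue
        for n in peel:
            shell[n] = k
            for m in remaining.pop(n):
                if m in remaining:
                    remaining[m].discard(n)
    return shell


# Toy graph (hypothetical): a triangle a-b-c, a pendant node d, an isolated node e.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": set(),
}
distribution = degree_distribution(toy)
shells = k_shell(toy)
```

For the toy graph, the triangle ends up in shell 2, the pendant node in shell 1, and the isolated node in shell 0.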

**For computer analysis you will need:**

Anaconda for Python 3.6

Gephi (link is above)

A Twitter account

For an introduction to Twitter network analysis, please see the following tutorial by postdoc Alexandre Bovet.

You can find further videos and tutorials pertinent to our research here, courtesy of the NIPS conference.
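As a rough illustration of how SQLite can support the kind of Twitter data management mentioned in the software list, the sketch below stores a few retweet records and aggregates them into weighted network edges using Python's built-in `sqlite3` module. The table name, column names, and sample rows are all hypothetical; this is not the schema or method used in the lab or in the tutorial.

```python
import sqlite3


def build_retweet_edges(rows):
    """Aggregate (retweeter, original_author) records into weighted edges.

    Returns a list of (original_author, retweeter, weight) tuples,
    where weight counts how often that retweet pair occurs.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Hypothetical schema: one row per observed retweet.
        conn.execute(
            "CREATE TABLE retweets (retweeter TEXT, original_author TEXT)"
        )
        conn.executemany("INSERT INTO retweets VALUES (?, ?)", rows)
        cursor = conn.execute(
            """
            SELECT original_author, retweeter, COUNT(*) AS weight
            FROM retweets
            GROUP BY original_author, retweeter
            ORDER BY weight DESC, original_author, retweeter
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample records, for illustration only.
sample = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("alice", "bob"),
    ("bob", "dave"),
]
edges = build_retweet_edges(sample)
```

The resulting edge list can then be loaded into a network tool such as NetworkX or Gephi for further analysis.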

**The courses below will allow you to analyze Big Data in a variety of circumstances ranging from systems biology, to ecology, to social networks and finance:**

**Complex Networks** at the Graduate Center – Physics – PHYS85200 – CRN 23395 – Professor H. Makse

This is my course on Network Theory; please see the syllabus.

**Machine Learning** at the Graduate Center – Computer Science – CSC74020 – Professor R. Haralick or Professor C. Yuan

Professor Haralick focuses more on the theoretical aspect while Professor Yuan focuses more on Natural Language Processing.

**Big Data Analysis: Principles and Methods** at the Graduate Center – Physics – PHYS85200 – CRN 32250 – Professor G. Patz

More application than theory, this course is a good introduction to the topic.

**Finance for Scientists** at the Graduate Center – Physics – PHYS85200 – CRN 30235 – Professor T. Schäfer

This course provides a good mathematical background on stochastic processes.

**Computational Methods in Physics** at the Graduate Center – Physics – PHYS85200 – CRN 23394 – Professor A. Poje

Ideal for those who have some experience in programming but want to become more comfortable with applications such as Monte Carlo methods.

**The following courses cover theoretical principles important to the core of our research program, and in fact, the first two are mandatory for first-year Ph.D. students at the Graduate Center:**

**Statistical Mechanics** at the Graduate Center – Physics – PHYS74100

**Mathematical Methods in Physics** at the Graduate Center – Physics – PHYS70100

**Quantum Information Theory** at the Graduate Center – Physics – PHYS85200

**Quantum Theory of Fields I & II** at the Graduate Center – Physics – PHYS82500 and PHYS82600, respectively

**There are also courses outside the CUNY system, which I suggest that you look into if you have time. New York University has a Center for Data Science, as does Columbia University. Some examples of online courses offered are:**

**Computational Physics** – PHYS-GA-2000

**Non-equilibrium Statistical Physics** – PHYS-GA-2061

**Online courses are also important to our field of study:**

Deep Learning is an important subject for any data scientist to know, although there is no course currently offered in the CUNY system. My students are self-taught or take online courses.

If you are learning the Python programming language (the language for Data Science), the Python Data Science Handbook is a very useful resource, as are Python courses that can be found at Coursera or edX.

For Data Science, Machine Learning, and Big Data Analysis, most of my students use Python, MATLAB, C, C++, Mathematica, and other languages. Please see **“For prospective students and postdocs: Software”** for further details.

There are also a great many online courses on applications of Data Science that can be found here. They are mostly (if not all) free, and range in difficulty level from introductory, like **“Introduction to Python for Data Science,”** to advanced, like **“Case Studies in Functional Genomics.”** There is even, at the time of this writing, an introductory course in the application of Data Analysis to biological systems, called **“Introduction to Bio: Annotation and Analysis of Genomes and Genomic Assays.”**

The above are a sampling of what my students found online, so you can also look into it further.