The Malawian film “The Boy Who Harnessed the Wind” blew my mind. As a scientist of the future, I always find it hard to explain in simple terms what a scientific mind does. In school we learn physics, chemistry, math and calculus, and draw all those complex-looking pictures, and then we forget what all this fucking science jargon is and what it is for. A lot of us even start hating science because we don’t find it relevant, or because our lack of understanding frightens us. And for a lot of people, it goes against their own agenda! But we sometimes forget that science is all about our survival on earth and our progressively better understanding of how nature works. Science provides the best evidence-based tools, and they have transformed our society.

This little boy saved the lives of hundreds in his small village by building a simple windmill, at a time when people were starving, dying, stealing and killing each other for food. It had not rained for years, and they could not grow crops. He was expelled from school because his dad couldn’t pay the fees. But he ran through the wind and knew its power, and he discovered that the light on his science teacher’s bicycle was powered by a dynamo driven by the rotation of the wheel. He then used a fan to power a radio, and convinced his dad to scrap their only bicycle to make a windmill that saved the lives of the villagers. Social entrepreneurship, but against everybody’s will and against everybody’s judgment. He experimented with his own hands, sneaked into the school library and learnt the science. At the core, he wanted to solve a problem: the problem of hunger.

And it saddens me when thousands of people mindlessly cut down trees, act out of selfish greed, and lead this generation and the next to disaster. We see the same thing with climate deniers, flat-Earth believers, people who reject the science of genetics, and so many others. Most people resist anything new that they haven’t seen before; they are bad at anticipating. It takes only a few scientific minds, but I wish people were more open to simply exploring the new.

# Tag: science

## Methods in Fourier Spectral Analysis

## Fourier Transform

It was a significant discovery in mathematics that almost any periodic function can be expanded as a sum of harmonic functions (sines and cosines); the resulting expression is known as a Fourier series. A harmonic of a repeating signal such as a sinusoidal wave is a wave whose frequency is a positive integer multiple of the frequency of the original wave, known as the fundamental frequency. The original wave is called the first harmonic, and the following harmonics are known as higher harmonics. A sufficiently smooth function can also be expanded in terms of polynomials, and the resulting expression is known as a Taylor series. If the underlying forces are harmonic and some periodicity possibly exists, then a harmonic series is more useful than polynomials because it produces simpler equations. Such a series expansion may reveal a few dominant terms, which can help identify known natural forces with the same period.

Let the symbol $y(t)$ represent a continuous function of time. The Fourier transform $Y(f)$ is a function of frequency $f$:

$Y(f) = \int_{-\infty}^{\infty} y(t)\, e^{2\pi i f t}\, dt$

The amplitudes and the phases of the sine waves can be found from this result. Given $Y(f)$, we can recover $y(t)$ using the inverse Fourier transform:

$y(t) = \int_{-\infty}^{\infty} Y(f)\, e^{-2\pi i f t}\, df$

The spectral power is defined as the square of the Fourier amplitude:

$P(f) = |Y(f)|^2$

However, real data do not span infinite time and will most likely be sampled at only a few discrete points in time. Suppose that we have values $y_j$ at times $t_j$, $j = 0, 1, \dots, N-1$; then an estimate of the Fourier transform is made by replacing the integral with a summation:

$Y(f) = \sum_{j=0}^{N-1} y_j\, e^{2\pi i f t_j}$

For evenly spaced samples, the inverse transform is also a summation, taken over the $N$ discrete frequencies $f_k$ defined below:

$y_j = \frac{1}{N} \sum_{k=0}^{N-1} Y(f_k)\, e^{-2\pi i f_k t_j}$

It is desirable to sample the data at equally spaced times, because nice statistical properties are available in that regular case. If the interval between equally spaced data points is $\Delta t$, then the highest frequency that will appear in the Fourier transform is given by the Nyquist-Shannon sampling theorem. The theorem states: “If a function contains no frequencies higher than $B$ Hz, then it is completely determined by giving its ordinates at a series of points spaced $1/(2B)$ seconds apart.” Therefore, the Nyquist frequency (highest frequency) is given by the following equation.

$f_N = \frac{1}{2\Delta t}$

The lowest frequency is the one that gives one full cycle in the time interval $T = N \Delta t$, namely $f_L = 1/T$. The other frequencies to evaluate are the multiples $f_k = k f_L$ ($k = 1, 2, \dots, N/2$) of the lowest frequency $f_L$. With $t_j = j \Delta t$, we can also derive the symmetric pair of equations:

$Y_k = \sum_{j=0}^{N-1} y_j\, e^{2\pi i j k / N}, \qquad y_j = \frac{1}{N} \sum_{k=0}^{N-1} Y_k\, e^{-2\pi i j k / N}$

Moreover, if $y(t)$ is band-limited (no frequencies below $-f_N$ or above $f_N$), then there is a relationship between the continuous function $y(t)$ and the discrete values $y_j$:

$y(t) = \sum_{j} y_j\, \frac{\sin[2\pi f_N (t - t_j)]}{2\pi f_N (t - t_j)}$ (when band limited)
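As a rough illustration of the discrete formulas above, the R sketch below evaluates the Fourier sum directly at multiples of the lowest frequency and then locates the strongest frequency. The sampling interval, record length and test signal are made-up values, and the positive sign in the exponent is an assumed convention (it does not change the power).

```r
# Sketch: direct evaluation of the discrete Fourier sum (all values hypothetical)
dt <- 0.5                        # assumed sampling interval
N  <- 64                         # assumed number of samples
t  <- (0:(N - 1)) * dt           # evenly spaced sample times
y  <- sin(2 * pi * 0.25 * t)     # test signal at 0.25 cycles per unit time

f_nyquist <- 1 / (2 * dt)        # highest frequency that can be resolved
f_low     <- 1 / (N * dt)        # lowest frequency: one full cycle over the record
f         <- (0:(N / 2)) * f_low # multiples of the lowest frequency, up to Nyquist

# Y(f_k) = sum_j y_j * exp(2*pi*i*f_k*t_j), evaluated one frequency at a time
Y     <- sapply(f, function(fk) sum(y * exp(2i * pi * fk * t)))
power <- Mod(Y)^2                # spectral power = squared Fourier amplitude

f_peak <- f[which.max(power)]    # frequency with the largest power
```

The same amplitudes (up to complex conjugation) should come out of R's built-in `fft(y)`, which is what one would use in practice; the explicit sum is only there to mirror the equations.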

## Periodogram

The Fourier transform gives us complex numbers, and the squared absolute values of these numbers form the periodogram. This is the first form of numerical spectral analysis and is used to estimate spectral power. Even though the data points are collected at specific, evenly spaced discrete times, it is possible to evaluate the periodogram at any frequency.
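To make the last point concrete, here is a minimal R sketch that evaluates a periodogram on a fine frequency grid that is not tied to the sample spacing. The function name `periodogram`, the division by the number of points and the test frequency 0.123 are illustrative assumptions, not something taken from the book.

```r
# Sketch: periodogram evaluated at arbitrary frequencies (names are hypothetical)
periodogram <- function(t, y, f) {
  r <- y - mean(y)                                  # remove the mean first
  # squared modulus of the Fourier sum at each requested frequency
  sapply(f, function(fk) Mod(sum(r * exp(2i * pi * fk * t)))^2) / length(y)
}

t <- 0:99                              # 100 evenly spaced samples
y <- cos(2 * pi * 0.123 * t)           # frequency that is not a multiple of 1/100
f_grid <- seq(0.001, 0.5, by = 0.001)  # ten times finer than the natural spacing

p     <- periodogram(t, y, f_grid)
f_hat <- f_grid[which.max(p)]          # location of the highest peak
```

Evaluating on a finer grid does not add resolution; it only shows the shape of the peak between the natural frequencies.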

## Fast Fourier Transform (FFT)

We can calculate the Fourier transform very efficiently using the FFT. It requires data at equally spaced time points and is most efficient when the number of points is an exact power of two. Interpolation is often used to produce evenly spaced data, which may introduce additional bias and systematic error. For real data consisting of $N$ data points $y_j$, each taken at time $t_j$, the power spectrum outputs a set of $N+1$ data points. The first and the last data points are the same, and they represent the power at frequency zero. The second through the $(N/2+1)$th data points represent the power at evenly spaced frequencies up to the Nyquist frequency. The spectral power for a given frequency is distributed over several frequency bins, so an optimum determination of the power requires combining this information and properly investigating leakage. The FFT generally calculates the amplitude for a fixed set of frequencies: $N/2$ complex amplitudes are calculated at $N/2$ different frequencies. Because these may not be the true frequencies present in the record, we subtract the mean from the data and then pad it with zeros, which evaluates the spectrum on a finer frequency grid.
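The mean-subtraction and zero-padding step can be sketched in R with the built-in `fft()`. The record length, the padding factor of 8 and the test signal are assumptions chosen only for illustration.

```r
# Sketch: subtract the mean, zero-pad to a power of two, then apply the FFT
dt <- 1                                   # assumed sampling interval
n  <- 100                                 # record length (not a power of two)
t  <- (0:(n - 1)) * dt
y  <- 3 + sin(2 * pi * 0.123 * t)         # hypothetical signal with a large offset

y0    <- y - mean(y)                      # remove the mean so frequency zero does not dominate
n_pad <- 2^ceiling(log2(8 * n))           # next power of two above 8 * n
y_pad <- c(y0, rep(0, n_pad - n))         # pad with zeros

Y     <- fft(y_pad)
k     <- 0:(n_pad / 2)                    # keep frequency zero up to Nyquist
f     <- k / (n_pad * dt)                 # frequency of each retained bin
power <- Mod(Y[k + 1])^2

f_hat <- f[which.max(power)]              # peak location on the finer grid
```

Padding makes the frequency grid denser, so the peak can be located more closely, but the width of the peak is still set by the original record length.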

## Aliasing

A time series consists of measurements made at a discrete, equally spaced set of times on some phenomenon that is actually evolving continuously, or at least on a much finer time scale. For example, samples of Greenland ice may represent the temperature every 100 years, but if the sampling interval is not precisely a whole number of years, we will sometimes measure winter ice and at other times summer ice. Even without any long-term variation in temperature, fluctuations (jumping up and down) will then appear in the data. So the underlying signal can contain frequencies higher than the Nyquist frequency associated with the sampling interval. A peak in the true spectrum at a frequency beyond the Nyquist frequency may be strong enough to be seen (aliased) at a lower frequency in the spectrum, which may give the impression that a frequency is significant when it is not. Or such a peak may partly obscure another frequency of interest. This phenomenon is known as aliasing.
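A small R sketch shows why a frequency above the Nyquist frequency cannot be told apart from a lower one once the data are sampled. The sampling interval and the two frequencies are made-up values.

```r
# Sketch: a sinusoid above the Nyquist frequency is indistinguishable from its alias
dt <- 1                          # assumed sampling interval, so Nyquist = 0.5
t  <- 0:49                       # sample times
f_true  <- 0.9                   # hypothetical true frequency, above Nyquist
f_alias <- abs(f_true - round(f_true * dt) / dt)   # folds back to about 0.1

y_true  <- cos(2 * pi * f_true  * t)   # what nature does
y_alias <- cos(2 * pi * f_alias * t)   # what the samples appear to show

max_diff <- max(abs(y_true - y_alias)) # essentially zero at the sample times
```

Because the two sampled series agree at every sample time, any spectrum computed from them is identical.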

## Tapering

The Fourier transform of sampled data is defined on a finite interval, and the function is implicitly treated as periodic. With a real data set this requirement is not met: the data end suddenly at $t=0$ and $t=T$, so the periodic extension can have discontinuities. These discontinuities introduce distortions (known as the Gibbs phenomenon) in the Fourier transform and generate false high-frequency content in the spectrum. Tapering (applying a data window) is used to reduce these artifacts. The data are multiplied by a taper function, which is a simple, slowly varying function that often goes towards zero near the edges. Some of the popular tapers are:

1. Sine taper

2. Hanning (offset cosine) taper

3. Hamming taper

4. Parzen or Bartlett (triangle) window

5. Welch (parabolic) window

6. Daniell (untapered or rectangular) window

Tapering degrades the frequency resolution of the spectrum. If the primary interest is the resolution of peaks, then the untapered periodogram is superior. However, tapering significantly reduces the sidelobes, and with them the bias that the sidelobes of a strong peak impose on other nearby peaks. This works because the taper functions $g$ are broad and slowly varying, so their Fourier transforms $FT(g)$ are narrow. The effect of tapering the data is to convolve the Fourier transform of the data with the narrow Fourier transform of the taper function, which amounts to smoothing the spectral values.


    # Plot the common taper (data window) shapes on the interval [0, Tlen]
    t <- seq(0, 1, by = 0.01)
    Tlen <- 1   # record length (called Tlen because T is shorthand for TRUE in R)

    # Sine taper
    g <- sin(pi * t / Tlen)
    plot(t, g, type = "l", col = 1, ylab = "g(t)")

    # Hanning (offset cosine) taper
    g2 <- 1/2 * (1 - cos(2 * pi * t / Tlen))
    lines(t, g2, col = 2)

    # Hamming taper
    g3 <- 0.54 - 0.46 * cos(2 * pi * t / Tlen)
    lines(t, g3, col = 3)

    # Parzen or Bartlett (triangle) window
    g4 <- ifelse(t > 0.5, 1 - (t - Tlen / 2) / (Tlen / 2), 2 * t)
    lines(t, g4, col = 4)

    # Welch (parabolic) window
    g5 <- 1 - (t - Tlen / 2)^2 / (Tlen / 2)^2
    lines(t, g5, col = 5)

    # Daniell window, drawn at height 0.5 and set to zero in the outer 20% on each side
    g6 <- rep(0.5, length(t))
    g6 <- ifelse(t <= 0.2, 0, g6)
    g6 <- ifelse(t >= 0.8, 0, g6)
    lines(t, g6, col = 6)

    legnd <- c("Sine", "Hanning", "Hamming", "Bartlett", "Welch", "Daniell(20%)")
    legend("topleft", legend = legnd, col = 1:6, lty = 1, cex = 0.75)
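To see what a taper buys, the following R sketch compares the leakage far from a peak with and without a Hanning taper. The test frequency, the padding factor and the frequency band used to measure leakage are arbitrary assumptions for illustration.

```r
# Sketch: sidelobe leakage with and without a Hanning taper (values hypothetical)
n <- 128
t <- 0:(n - 1)
y <- cos(2 * pi * 0.2037 * t)                  # off-grid frequency, so leakage is visible
hann <- 0.5 * (1 - cos(2 * pi * t / (n - 1)))  # Hanning (offset cosine) taper

# zero-padded power spectrum, normalized so that the highest peak equals 1
spec <- function(x) {
  p <- Mod(fft(c(x, rep(0, 7 * n))))^2
  p / max(p)
}
p_raw <- spec(y)                               # untapered periodogram
p_tap <- spec(y * hann)                        # tapered periodogram

f   <- (0:(8 * n - 1)) / (8 * n)               # frequency of each bin
far <- f > 0.35 & f < 0.45                     # a band well away from the peak
leak_raw <- max(p_raw[far])                    # strongest untapered sidelobe there
leak_tap <- max(p_tap[far])                    # strongest tapered sidelobe there
```

In this setup the tapered sidelobes should come out several orders of magnitude lower, at the price of a wider main peak.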

## Multitaper Analysis

We apply a taper, or data window, to reduce the sidelobes of the spectral lines. Basically, we want to minimize the leakage of power from strong peaks to other frequencies. In the multitaper method, several different tapers are applied to the data and the resulting powers are then averaged. Each data taper is multiplied element-wise by the signal to provide a windowed trial from which one estimates the power at each component frequency. Because each taper is orthogonal to all the other tapers, the windowed signals provide statistically independent estimates of the underlying spectrum. The final spectrum is obtained by averaging over all the tapered spectra. D. Thomson chose the Slepian sequences (discrete prolate spheroidal sequences) as tapers, since these vectors are mutually orthogonal and possess desirable spectral concentration properties. The multitaper method can suppress sidelobes while still retaining relatively high resolution. If we use only a few tapers, the resolution is not degraded much, but the sidelobes are not reduced much either. So there is a trade-off, which is often misunderstood.
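The averaging idea can be sketched in base R. Note the substitution: instead of Thomson's Slepian sequences, this sketch uses sine tapers, which are also mutually orthogonal and need no extra package. The function names, the number of tapers and the test signal are all assumptions.

```r
# Sketch: multitaper averaging with sine tapers (a stand-in for Slepian tapers)
sine_tapers <- function(n, K) {
  t <- 1:n
  # column k holds the k-th orthonormal sine taper
  sapply(1:K, function(k) sqrt(2 / (n + 1)) * sin(pi * k * t / (n + 1)))
}

multitaper_power <- function(y, K = 4) {
  y <- y - mean(y)
  V <- sine_tapers(length(y), K)
  # one power spectrum per taper (one column each), then average across tapers
  S <- apply(V, 2, function(v) Mod(fft(v * y))^2)
  rowMeans(S)
}

set.seed(42)
n <- 256
t <- 0:(n - 1)
y <- sin(2 * pi * 0.125 * t) + rnorm(n, sd = 0.5)  # hypothetical noisy signal

p    <- multitaper_power(y, K = 4)
f    <- (0:(n - 1)) / n
half <- 2:(n / 2)                                  # positive frequencies below Nyquist
f_hat <- f[half][which.max(p[half])]               # location of the strongest peak
```

Raising `K` averages more spectra and lowers the variance, but it also widens the peak, which is the trade-off described above.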

## Blackman-Tukey Method

Blackman and Tukey prescribed techniques for analyzing a continuous spectrum that is biased by the sidelobes of strong peaks in the ordinary periodogram. The Blackman-Tukey (BT) method was developed before 1958, prior to the FFT (Fast Fourier Transform). A discrete Fourier transform of $N$ points requires the calculation of $N^2$ sines and cosines. With the slower computers of the pre-FFT days, calculating a Fourier transform was therefore expensive. The BT method reduced the computing time by reducing the size of the data set by a factor of the lag in the autocorrelation calculation. The BT method is based on a fundamental theorem of Fourier transforms: the Fourier transform of a correlation is equal to the product of the Fourier transforms (with one of them complex-conjugated). The correlation of two real functions $g$ and $h$ is given by the first equation below, and the theorem by the second.

$C(\tau) = \int_{-\infty}^{\infty} g(t+\tau)\, h(t)\, dt$

$FT(C) = G(f)\, H^*(f)$

When $g = h$, this is called the Wiener-Khintchine theorem. Here, $P$ is the spectral power.

$FT(C) = |G(f)|^2 = P(f)$

The algorithm in the BT method calculates a partial autocorrelation function, defined by

$A_{BT}(\tau) = \int_{0}^{N/l} y(t+\tau)\, y(t)\, dt$

Here, $N$ is the length of the data set, but we integrate only up to $N/l$. $l$ is associated with the lag. When $l = 3$ (recommended by Blackman and Tukey) is used, we say that “a lag of 1/3” is used. The Fourier transform of the partial autocorrelation function then gives us the spectral power. Moreover, the symmetry of the partial autocorrelation function saves half of the computation time.

If $l = 1$, then it is basically the full autocorrelation function and gives the same answer as the ordinary periodogram.

Because we are using the partial correlation function instead of the full correlation, the spectral power function gets smoother. Therefore, we lose resolution in the BT method. However, it averages the sidelobes into the main peak and thereby gives a better estimate of the true power. The smoothing in the BT method is different from the smoothing produced by a taper. With a taper, it is the Fourier transform that is smoothed, whereas with Blackman-Tukey it is the spectral power. A spectral amplitude that varies rapidly will be averaged to zero by a taper. But in the BT method a rapidly varying amplitude does not necessarily average to zero, since the process of squaring can make the function positive over the region of smoothing. Tapering does not average the sidelobes into the main peak, because a shift in the time scale behaves like a phase modulation. When tapering is applied, the sidelobes will not have the same phase, and if they are averaged in amplitude they can reduce the strength of the peaks. A major challenge in the BT method is that we have to estimate the proper lag before doing all the calculations. Blackman and Tukey recommended starting with the value 1/3 for the lag.
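A minimal R sketch of the idea: compute the autocovariance only up to a fraction of the record length, then take its cosine transform. The function name `bt_spectrum`, the plain truncation without any extra lag window, and the test signal are assumptions of this sketch rather than the exact procedure in the book.

```r
# Sketch: Blackman-Tukey style spectrum from a truncated autocovariance
bt_spectrum <- function(y, f, lag_fraction = 1 / 3) {
  y <- y - mean(y)
  n <- length(y)
  M <- floor(n * lag_fraction)                     # largest lag that is used
  # autocovariance for lags 0..M
  acov <- sapply(0:M, function(m) sum(y[1:(n - m)] * y[(1 + m):n]) / n)
  # the autocovariance is symmetric, so a cosine sum over positive lags is enough
  sapply(f, function(fk) acov[1] + 2 * sum(acov[-1] * cos(2 * pi * fk * (1:M))))
}

t <- 0:299
y <- cos(2 * pi * 0.1 * t)                         # hypothetical test signal
f <- seq(0, 0.5, by = 0.005)

p     <- bt_spectrum(y, f)                         # "a lag of 1/3"
f_hat <- f[which.max(p)]                           # location of the main peak
```

With plain truncation the estimate can dip below zero away from the peak; practical implementations usually multiply the autocovariance by a lag window to control that.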

## Lomb-Scargle Periodogram

The classic periodogram requires evenly spaced data, but in paleoclimatic research we frequently encounter unevenly spaced data. Lomb and Scargle showed that if the cosine and sine coefficients are normalized separately, then the classic periodogram can be used with unevenly spaced data. If we have a data set $(t_k, y_k)$, $k = 1, \dots, N$, we first calculate the mean and variance:

$\bar{y} = \frac{1}{N}\sum_{k=1}^{N} y_k, \qquad \sigma^2 = \frac{1}{N-1}\sum_{k=1}^{N} (y_k - \bar{y})^2$

For every frequency $f$, a time constant $\tau$ is defined by

$\tan(4\pi f \tau) = \frac{\sum_{k=1}^{N} \sin(4\pi f t_k)}{\sum_{k=1}^{N} \cos(4\pi f t_k)}$

Then the Lomb-Scargle periodogram of the spectral power at frequency $f$ is given by

$P(f) = \frac{1}{2\sigma^2}\left\{ \frac{\left[\sum_{k=1}^{N}(y_k - \bar{y}) \cos(2\pi f (t_k-\tau))\right]^2}{\sum_{k=1}^{N}\cos^2(2\pi f (t_k-\tau))} + \frac{\left[\sum_{k=1}^{N}(y_k - \bar{y}) \sin(2\pi f (t_k-\tau))\right]^2}{\sum_{k=1}^{N}\sin^2(2\pi f (t_k-\tau))} \right\}$

With evenly spaced data, two signals of different frequencies can have identical values at every sample time, which is known as aliasing. That is why the classic periodogram is usually shown only for frequencies from 0 to 0.5 cycles per sampling interval, as the rest is a mirrored version. With the Lomb-Scargle periodogram applied to unevenly spaced data, the aliasing effect can be significantly reduced.
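The Lomb-Scargle formulas translate almost line by line into R. The function name `lomb_scargle`, the random sample times and the test frequency are assumptions made for this sketch.

```r
# Sketch: Lomb-Scargle periodogram for unevenly spaced data (names are hypothetical)
lomb_scargle <- function(t, y, f) {
  r  <- y - mean(y)                      # deviations from the mean
  s2 <- var(y)                           # sample variance, with N - 1 in the denominator
  sapply(f, function(fk) {
    w   <- 2 * pi * fk
    # time offset tau from tan(4*pi*f*tau) = sum(sin(4*pi*f*t)) / sum(cos(4*pi*f*t))
    tau <- atan2(sum(sin(2 * w * t)), sum(cos(2 * w * t))) / (2 * w)
    ck  <- cos(w * (t - tau))
    sk  <- sin(w * (t - tau))
    # cosine and sine terms are normalized separately
    (sum(r * ck)^2 / sum(ck^2) + sum(r * sk)^2 / sum(sk^2)) / (2 * s2)
  })
}

set.seed(7)
t <- sort(runif(120, 0, 100))                      # unevenly spaced sample times
y <- sin(2 * pi * 0.07 * t) + rnorm(120, sd = 0.3) # hypothetical noisy signal
f <- seq(0.005, 0.5, by = 0.001)

p     <- lomb_scargle(t, y, f)
f_hat <- f[which.max(p)]                           # frequency of the highest peak
```

The frequency grid starts above zero because the expression is undefined at exactly zero frequency.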

## Maximum Likelihood Analysis

In the maximum likelihood method, we adjust the parameters of a model and ultimately find the parameter values for which the model has the maximum probability (likelihood) of generating the data. To estimate the spectral power, we first select a false alarm probability and calculate the normalized periodogram. We identify the maximum peak and test it against the false alarm probability. If the maximum peak passes the false alarm test, we determine the amplitude and phase of the sinusoid representing the peak. Then we subtract that sinusoid from the data, which also removes the annoying sidelobes associated with the peak. After the peak is removed, the variance of the total record is also reduced. With the new, subtracted data we continue to look for the next strongest peaks, following the same procedure. We stop when a peak does not pass the false alarm test. We need to choose the false alarm probability carefully: if it is too low, we can miss some significant peaks; if it is too high, we can mislabel noise as peaks.
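The peak-removal loop described above can be sketched in R. Several pieces here are assumptions: the normalization of the periodogram, the false alarm formula $1 - (1 - e^{-z})^M$ with $M = N/2$ independent frequencies, the least-squares fit with `lm()` (which coincides with maximum likelihood under Gaussian noise), and all names and test values.

```r
# Sketch: iterative peak detection and removal with a false alarm test
norm_periodogram <- function(t, y, f) {
  r <- y - mean(y)
  sapply(f, function(fk) Mod(sum(r * exp(2i * pi * fk * t)))^2) / (length(y) * var(y))
}

extract_peaks <- function(t, y, f, p_false = 0.01, max_peaks = 10) {
  found <- c()
  M <- length(y) / 2                     # assumed number of independent frequencies
  for (i in seq_len(max_peaks)) {
    p <- norm_periodogram(t, y, f)
    z <- max(p)
    prob <- 1 - (1 - exp(-z))^M          # chance that pure noise gives a peak this high
    if (prob > p_false) break            # peak fails the false alarm test: stop
    f0  <- f[which.max(p)]
    # fit amplitude and phase of the sinusoid at f0, then subtract it
    fit <- lm(y ~ cos(2 * pi * f0 * t) + sin(2 * pi * f0 * t))
    y   <- residuals(fit)
    found <- c(found, f0)
  }
  found
}

set.seed(3)
t <- 0:199
y <- 2 * sin(2 * pi * 0.05 * t) + sin(2 * pi * 0.21 * t + 1) + rnorm(200)
f <- seq(0.005, 0.495, by = 0.0025)

peaks <- extract_peaks(t, y, f)          # frequencies in order of detection
```

Each pass works on the residuals of the previous fit, so weaker peaks are tested against a background from which the stronger ones and their sidelobes have already been removed.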

## Maximum Entropy Method

It is assumed that the true power spectrum can be approximated by an equation which has a power series. This method finds the spectrum that is closest to white noise (has the maximum randomness, or “entropy”) while still having an autocorrelation function that agrees with the measured values, in the range for which there are measured values. It yields narrower spectral lines. This method is suitable for relatively smooth spectra. With noisy input functions, spurious peaks may occur if a very high order is chosen. This method should be used in conjunction with other, more conservative methods, such as periodograms, to choose the correct model order and to avoid false peaks.
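In R, a maximum entropy style spectrum can be sketched with `spec.ar()` from the base `stats` package, since Burg's maximum entropy method is equivalent to fitting an autoregressive model. The model order of 12 and the test signal are assumptions; choosing the order is exactly the delicate step mentioned above.

```r
# Sketch: maximum entropy (autoregressive, Burg) spectrum with base R
set.seed(11)
t <- 0:255
y <- sin(2 * pi * 0.1 * t) + rnorm(256, sd = 0.5)   # hypothetical noisy signal

# fit an AR model of assumed order 12 by Burg's method and evaluate its spectrum
sp <- spec.ar(y, n.freq = 501, order = 12, method = "burg", plot = FALSE)

f_hat <- sp$freq[which.max(sp$spec)]                # location of the sharpest line
```

Rerunning with a much higher `order` is a quick way to see the spurious peaks that the text warns about.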

## Cross Spectrum and Coherency

If a climate proxy $a(t)$ is influenced or dominated by a driving force $b(t)$, we can use the cross spectrum to see whether both have significant amplitude at the same frequencies. The cross spectrum is given by the product of the Fourier transforms:

$C(f) = A(f)\, B^*(f)$

where $A$ is the Fourier transform of $a$ and $B^*$ is the complex conjugate of the Fourier transform of $b$. If we want to know whether two signals are in phase with each other, regardless of amplitude, then we can take the cross spectrum, square it, and divide by the spectral powers of the individual signals; this quantity is the coherency. Coherency measures only the phase relationship and is not sensitive to amplitude, which is a big drawback.

Coherency is valuable if two signals that vary in time stay in phase over a band of frequencies instead of at a single frequency. Evaluated at a single frequency without any averaging, the ratio is always exactly 1. Therefore, a band of adjacent frequencies is used in the averaging process to compute coherency, with $\langle \cdot \rangle$ denoting the average over the band:

$c(f) = \frac{\left| \langle A(f)\, B^*(f) \rangle \right|^2}{\langle |A(f)|^2 \rangle\, \langle |B(f)|^2 \rangle}$
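Band-averaged coherency can be sketched in R as follows. The function names, the running-mean band of seven bins and the white-noise test series are assumptions for illustration.

```r
# Sketch: band-averaged coherency between two series (names are hypothetical)
band_mean <- function(x, h) {
  # running mean over 2h + 1 adjacent frequency bins, wrapping around at the ends
  as.numeric(stats::filter(x, rep(1 / (2 * h + 1), 2 * h + 1), circular = TRUE))
}

coherency <- function(a, b, h = 3) {
  A <- fft(a - mean(a))
  B <- fft(b - mean(b))
  C <- A * Conj(B)                                   # cross spectrum
  # average real and imaginary parts separately, then square the modulus
  num <- band_mean(Re(C), h)^2 + band_mean(Im(C), h)^2
  num / (band_mean(Mod(A)^2, h) * band_mean(Mod(B)^2, h))
}

set.seed(5)
a <- rnorm(256)
b <- rnorm(256)                      # independent of a

coh_same  <- coherency(a, 5 * a)     # same phase, very different amplitude
coh_indep <- coherency(a, b)         # unrelated signals
```

Scaling one series by 5 leaves the coherency at 1, which illustrates the insensitivity to amplitude, while the independent pair gives values well below 1.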

## Bispectra

In bispectra, the coherency relationship between several frequencies is used. A bispectrum shows a peak whenever (1) three frequencies $f_1$, $f_2$ and $f_3$ are present in the data such that $f_1 + f_2 = f_3$, and (2) the phase relationship between the three frequencies is coherent for at least a short averaging time over a band near these frequencies. If the nonlinear processes in a driving force (e.g. the eccentricity or inclination of the Earth’s orbit) contain coherent frequency triplets, then the response (i.e. the climate) is likely to contain the same frequency triplets. For example, if the climate proxy is driven by eccentricity, we should be able to find the eccentricity triplet in it. Thus, by comparing the bispectrum plot of a climate proxy with the bispectrum plot of the driving forces, we can verify the influence of the driving forces.
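For reference, one common way to write the quantity being plotted is sketched below. The notation is an assumption: $Y(f)$ is the Fourier transform of the data, and $\langle \cdot \rangle$ is an average over short segments or over a band near the chosen frequencies.

$B(f_1, f_2) = \left\langle Y(f_1)\, Y(f_2)\, Y^*(f_1 + f_2) \right\rangle$

If the phases satisfy $\phi_1 + \phi_2 - \phi_3 = \text{const}$ from segment to segment, the terms add up and a peak appears; if the phases are unrelated, the terms average towards zero.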

## Monte Carlo Simulation of Background

Monte Carlo simulation is extremely useful for answering questions such as whether the data are properly tuned, whether the timescale is incorrect, whether some spectral power is leaking to adjacent frequencies, and whether a peak has real structure, and also for understanding structures near the base of a peak (a shoulder) in a spectral analysis. Generally, a Monte Carlo simulation is run many times. For each simulation, a real signal (a sinusoidal wave) is generated, a random background signal is added, and the spectral power is calculated to look for shoulders. In this way, the frequency of shoulder occurrence can be measured and the role of randomness understood. It is important to create a background that behaves similarly to the background in the real data; a dissimilar background will lead to false conclusions. We also need to estimate the statistical significance of the peaks very carefully.
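A toy version of such a simulation in R is sketched below. Everything specific is an assumption: the white-noise background (real climate backgrounds are usually redder), the position of the shoulder at 1.5 frequency bins from the peak, and the 10% threshold that defines a shoulder.

```r
# Sketch: Monte Carlo estimate of how often a "shoulder" appears next to a peak
set.seed(2024)
n  <- 200
t  <- 0:(n - 1)
f0 <- 0.1                         # frequency of the real signal
f_side <- f0 + 1.5 / n            # hypothetical shoulder position near the base of the peak

# spectral power of a series y at a single frequency f
power_at <- function(y, f) Mod(sum((y - mean(y)) * exp(2i * pi * f * t)))^2

n_sim <- 500
has_shoulder <- replicate(n_sim, {
  y <- sin(2 * pi * f0 * t) + rnorm(n, sd = 1)     # signal plus assumed white background
  power_at(y, f_side) > 0.1 * power_at(y, f0)      # assumed shoulder criterion
})

shoulder_rate <- mean(has_shoulder)                # fraction of runs showing a shoulder
```

To follow the advice in the text, the `rnorm()` background would be replaced by noise with a spectrum similar to that of the real record before drawing any conclusion from `shoulder_rate`.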

(This article is a quick review of Fourier spectral analysis from the book “Ice Ages and Astronomical Causes: Data, Spectral Analysis and Mechanisms” by Richard A. Muller and Gordon J. MacDonald.)

## Diversity in the Past and Present

Well, a lot of us are fans of Khaleesi on Game of Thrones (GOT) and her fire-exhaling dragons. So it’s a natural question 🙂 whether creatures ever existed that could breathe or exhale fire. And science answers with a big “No”, as there’s no evidence. Apparently they exist only for the Bible thumpers and in our own wishful, imaginative thinking! It feels exciting to see those dragons on GOT though!!

I am reading “The Ancestor’s Tale” by Richard Dawkins (RD), and within a few hours I have learnt so much about the mesmerizing animals on this planet, each with its own features, which survived, diverged, got trapped and so on. The Jurassic Park movie did a great job on the Jurassic period, but there’s so much more story in our ancestry and concestry.

I wish there were GOT-like fiction based on the animals and creatures that actually exist, as most of us are not fans of Animal Planet-like channels.

Why are we so obsessed with the non-existent when there’s such beauty in what exists? 🤔

The Ancestor’s Tale by R.D.

https://www.goodreads.com/book/show/17977.The_Ancestor_s_Tale

## Bitcoin – how it was perceived in 2013?

A talk from 2013 at TED (Technology, Entertainment and Design). Watch the talk, and think about what is different between the picture in 2013 and what we have now in 2018, nearly 5 years later. Enjoy!

Around that time, I first heard about bitcoin, and my mind was blown when I realized it could potentially give birth to a global currency. I’m lucky to have found the talk again for you all while exploring TED talks. Even though Bitcoin was born with the dream of being a global currency, nowadays it’s mainly treated as digital gold, an asset where you invest your money in the hope of making a profit. But when I first learned about Bitcoin, I wasn’t excited by its profit-making ability; rather, I was excited by how a currency could be established without the authority of any government, simply by using some mathematics, cryptography and algorithms. The beauty of mathematics and the power of the internet mesmerized me, and I started learning more. Since then, it has evolved so much and experienced so much: it’s been popularized all around the world, criminalized in some countries, stolen, hacked and what not! So it may not turn into a currency. Who will buy a coffee today with 0.00036 BTC (~3 USD) when it may be worth 360 USD one day (when 1 BTC ~ 1,000,000 USD)? It sounds foolish. But trust me, I have bought my dinner with bitcoin in an Indian restaurant named Khana Khazana in West Lafayette, and that bitcoin will be worth more than that in 2020. I don’t regret it, because I didn’t care what its value would be in the future; rather, my fellow bitcoiners and I were excited to be able to use bitcoin to buy our food. So maybe not bitcoin, but some other cryptocurrency will be a global coin. Governments will now create their own crypto coins (https://www.cnbc.com/2017/11/30/cryptocurrency-craze-springboards-government-backed-coin.html). But whatever the future holds, it’s gonna be an interesting one in terms of how we use our money. When something is exciting, you can always smell it when you play with it, and this video just brought back my memories. I hope that when I am old, I can tell my grandchildren that I was part of a monetary revolution.

## Oxygen-18 difference versus Number of Events during Holocene Epoch

## Fossil Record as a way to learn earth history

Who studies fossils? A paleontologist. Why do paleontologists study fossils? Because the fossil record carries information and provides clues about the climatic changes of the planet and about the geographical changes that have occurred on Earth. For example, plant fossils and pollen fossils have been used to indicate climatic change on Earth. Scientists have also used fossils to create the geologic timescale. We can see fossil evolution during the Paleogene period (which began about 66 million years ago) in the image below. You can see the branches and sub-branches in the image. Subbotina trivialis (genus: Subbotina, species: trivialis) is highlighted in red.

Determining the age of a fossil is very important and very challenging. Fossils are very often found in rocks, and by comparing one rock formation with another (relative dating) it’s possible to find a relative age for a fossil. Dating rocks involves measuring the decay of radioactive isotopes such as carbon-14, uranium-238, potassium-40, aluminium-26, samarium-147 and rubidium-87 (which decays to strontium-87). Fossilization is a rare event, as an organism may leave no trace at all after its extinction. Therefore a fossil, as a record of life, is a very significant thing to discover. An organism’s physical structure, and information deduced from it such as its environment, diet and life cycle, can be obtained by studying fossils. Trace fossils, or fossilized marks left by the activities of creatures, such as trails, footprints and burrows, are also recorded and used as a source of information. From the fossil record throughout geologic time, scientists have understood that the evolution of life is not a linear process. Sometimes the process is slow and sometimes it’s exponential. By studying the fossil record we have also discovered that there might be periodicity in mass extinctions. Even the concept of plate tectonics was supported by the fossil record. The more I learn about fossils, the more exciting it becomes.

## An appetite for wonder

I am happy that I have accomplished something on Christmas Day. At last I have finished “An Appetite for Wonder: The Making of a Scientist”, the memoir of my favorite scientist, Richard Dawkins. It was such an inspiring read, especially at a time when I am struggling a lot academically and lacking enthusiasm. The first part of the book is about his childhood in Africa. He had truly amazing naturalist parents. The book has very interesting descriptions of the schools he went to. His education at Oxford had a great influence on him. The whole book shows how he was transformed, leading to a career in science in which he unveiled some complex ideas through his own unique style of communication. His marvelous way of seeking answers through well-posed questions is really, really inspiring. Thanks, Richard. I will wait for the second part.

## Books that I have been reading

There are several books that I have been reading for quite a long time, some of them for more than a year. I don’t know which strategy is best: reading a single book with full attention and trying to finish it first, or reading several books at the same time as I am doing. How did it happen that I got in touch with several writers at the same time? The reason is probably that all of the books I have been reading are long non-fiction books that you cannot finish very quickly. Some parts of a book require more attention and multiple readings, and need you to stop and think for a while. And in every book there are some portions that feel boring to read. I think those boring parts sometimes made me procrastinate on finishing the book and persuaded me to move on to another topic. The books that I have been reading are “The Blank Slate” by Steven Pinker, “The Procrastination Equation” by Piers Steel, and “An Appetite for Wonder” by Richard Dawkins. Steven Pinker is one of the best contemporary scientists and writers, and I admire him very much. In The Blank Slate, he argues against the idea that the human mind is completely malleable. I have been fascinated by several chapters of that book, but I think it will still take a great amount of time to finish. The book that I have been enjoying for the last few weeks is the memoir of Richard Dawkins, about how he became a renowned scientist. This book is really thrilling for me. As I want to become a scientist, I can relate my life to many different things that Dawkins describes with his persuasive rhetoric. I have been reading The Procrastination Equation because I have a lifelong difficulty finishing work before the deadline. That book gave me a lot of insight, but I think I will have to finish reading it very carefully. Recently I have been feeling very lazy about reading it, as it is really informative but not highly entertaining.

Another book, which I have been listening to as an audiobook, is “The Willpower Instinct”, which describes how we can develop our willpower. I am happy that I am in the process of learning from all these books. I wish to read more books, more often. The amount of time I spend consuming videos on YouTube and Netflix should be reduced, as most of the programs I watch, even though they give excitement and thrills, don’t necessarily, I should admit, make me the thoughtful person that I want to become.