<p><i><a href="https://vtomole.com/blog">Victory's Blog</a> · Victory Omole (vtomole2@gmail.com)</i></p>
<h2><a href="https://vtomole.com//blog/2020/02/21/snowflake">Snowflake</a></h2>
<p><i>2020-02-21</i></p>
<p><i>By Sara Smith and Victory Omole</i></p>
<section>
<h4><i>Intro</i></h4>
<p>A Squid was enjoying a midnight snack in her aquarium when, all of a sudden: “splash!” A can almost hit her. While looking to see where the can had come from, she saw an Alligator and a Liger arguing.</p>
<p>“Why did you do that?” the Squid asked, annoyed.</p>
<p>“Do what?” The Alligator asked.</p>
<p>“You threw a can at my house."</p>
<p>“First of all, I didn’t throw it. I kicked it. Second of all, who do you think you are?”</p>
<p>As soon as they were about to brawl, the Liger stepped in.</p>
<p>“Thrash didn’t mean to do that, he’s just frustrated.”</p>
<p>“Well, tell him to get this can out of my house!”</p>
<p>“I won’t do it unless you ask nicely,” Thrash sneered.</p>
<p>“Oh, just do it, Thrash. Don’t be difficult,” the Liger said.</p>
<p>After Thrash removed the can, the Squid said, “Thanks for asking him to do that...”</p>
<p>“Snowflake. My name is Snowflake. Nice to meet you.”</p>
<p>“Nice to meet you too. I’m Queen Esmeralda.”</p>
<h4><i>Esmeralda's freedom</i></h4>
<p>Every night for the next three weeks, Snowflake would stop by Esmeralda’s aquarium. On some nights he would be there for half an hour, and sometimes he would be there the whole night. Esmeralda was fun to hang out with. She was observant, and her sense of humor was off the charts. He couldn't comprehend why Thrash refused to hang out with her. From their long talks, Esmeralda learned how Snowflake could get out of his cage: Thrash was letting him out. Thrash could leave his own pen because the zookeepers hadn't placed a lid above his house, and his tank was short enough that he could use a log in his enclosure as a staircase to climb out.</p>
<p>One night, while touring the zoo, Thrash had heard Snowflake weeping. Thrash asked why Snowflake was crying, and Snowflake said that he was tired of being locked up in a cage. Thrash empathetically opened Snowflake's cage, on the condition that he go back in the morning so the zookeepers wouldn't find out.</p>
<p>One night, Esmeralda started rambling about how she hates being stuck in an aquarium. She wanted to be more independent to observe other sections of the zoo. Feeling sorry for his friend, Snowflake came up with an idea: He would borrow a fish tank from the maintenance facility every night for Esmeralda to live in. He would put the fish tank on his back and take her around the zoo while the guard was not vigilant.</p>
<p>The amount of joy Esmeralda was bursting with when she saw every animal in the zoo the following night made all the trouble Snowflake went through to retrieve the fish tank worth it.</p>
<h4><i>Snowflake's yearning</i></h4>
<p>The next night, Esmeralda asked: “Why does Thrash hate me?”</p>
<p>“I think he’s mad at you because the only time you two met, you almost fought each other. He thinks you are not nice because you weren’t nice to him when you first met,” Snowflake said.</p>
<p>“He wasn’t nice either...”</p>
<p>“I know, that’s why I’m surprised you don’t hate him like he hates you.”</p>
<p>"Why were you two arguing? What made him kick a can into my aquarium?"</p>
<p>"I want to get out of this zoo, but I don't want to do it alone because I don't want to be outside on my own. I asked Thrash to come with me and he refused. He told me how to escape but he said he wasn't going to escape with me. I got mad and said that his plan was stupid."</p>
<p>“I know how you feel. I wanted to get out of my tank but when I realized that it was impossible for me to, I learned to accept my predicament and be happy with what I have. You should be grateful that you even get to be out of your cage at all.” Esmeralda said.</p>
<p>“Didn't you dream of not just being outside of your aquarium, but outside this zoo?”</p>
<p>“No, I was too busy focusing on getting out of my aquarium. I didn't think about being outside this zoo. In fact, I don't want to. I have the perfect amount of freedom right now. I don’t need more!” Esmeralda said.</p>
<p>“But...”</p>
<p>“You know what, I don’t want to talk about this.”</p>
<h4><i>Snowflake's persistence</i></h4>
<p>Snowflake did not want to stop talking about escaping, because he didn't want to escape without his friends. Since his friends had declined his offer, he was despondent. He stopped eating, he was tired all the time, and he was sleeping too much. Esmeralda noticed, and she felt terrible seeing her friend in such a state. She agreed to escape with him because she believed his spirits would be lifted if they left. Snowflake had lifted her spirits by getting her a fish tank; she had to return the favor. They decided to use Thrash's plan: wait until nightfall; wait until the guard goes to the bathroom; run across the zoo to the maintenance facility; steal a ladder and use it to climb the fence.</p>
<h4><i>Thrash's conflict</i></h4>
<p>Early the next morning, Snowflake disclosed his and Esmeralda's scheme to Thrash.</p>
<p>"Okay, well, I'm still not going with you."</p>
<p>“Please. Do it for me,” Snowflake begged. “If I leave, you won't have any friends here.”</p>
<p>"I can make new friends."</p>
<p>"I believe it, but it's really hard for you to make a good first impression; and that's what most people base their friendships on. Also, I'm not going to be happy out there if you're not with me."</p>
<p>"All you think about is yourself. I bet the only reason Esmeralda decided to come with you is because you guilted her into it."</p>
<p>"No, Esmeralda loves me! Dude, how about this. If you come with us, I'll literally do anything you want for the following year."</p>
<p>“Oh! Okay, fine, that will work, haha,” Thrash said mischievously.</p>
<h4><i>The escape</i></h4>
<p>The next night, Snowflake fetched the fish tank from the maintenance facility and laid Esmeralda in it. He went back for the ladder and placed it against the fence in a spot the guard was least likely to visit. He ran back to Esmeralda's fish tank, put it on his back, and waited until the guard went to the bathroom. Once Thrash was over the fence, Snowflake followed with Esmeralda. The guard noticed. Snowflake realized that someone had to distract the guard so the rest could escape. He threw the fish tank onto Thrash's back and yelled for his friends to run, but Thrash refused to leave without him. Snowflake roared at the guard as a scare tactic, but the guard, thinking Snowflake was going to maul him, shot the Liger dead. Thrash ran away immediately.</p>
<h4><i>The end</i></h4>
<p>After running for a mile or so, Thrash looked around to make sure they were not being followed. He then started walking. He walked all night until he found a pond where they could take a break. After sleeping for half a day, Thrash started walking again; not caring where they were headed as long as it was away from the zoo. Knowing they were stuck with each other, they started to make amends:</p>
<p>“Remember when Snowflake stopped us from fighting?” Thrash said.</p>
<p>“Yeah, and remember all the jokes he told? Man, that dude was hilarious!” Esmeralda said.</p>
<p>After a few minutes of silence, Esmeralda said: “I didn’t want to leave the zoo. I just wanted to help Snowflake. But I'm not going to lie. It feels so good to be out here!"</p>
<p>"It really does!!" Thrash asserted.</p>
<h4>Discuss on <a href="https://www.reddit.com/user/vtomole/comments/f76a77/snowflake/">Reddit</a></h4>
</section>
<h2><a href="https://vtomole.com//blog/2020/01/25/knowledge">The Depth of Knowledge</a></h2>
<p><i>2020-01-25</i></p>
<section>
<p><i>"Knowledge of history is stored somewhere. There is a record of everything that took place. There are forces which keep those records in a form that can be interpreted by some consciousness: so that it can understand the events that took place."
 Thomas Grothe (October 3 2019)</i></p>
<h3>Emergence</h3>
<p>Knowledge’s breadth is greater than its depth.</p>
<p>It doesn’t take long, asking successive “whys”, to reach a limit of human knowledge. A child needs to ask only about six “whys” that depend on one another before you don’t know how to answer her question.</p>
<p>“Why is the sky blue?” </p>
<p>“Because the particles in the air scatter blue light more than the other colors.”</p>
<p> “Why?”</p>
<p>“Because blue colors travel as short waves.”</p>
<p>“Why?”</p>
<p>After about three more “whys”, you’ll have to explain quantum theory, and after that you’ll have to explain why quantum theory is the way it is: a question we don’t have an answer to. On Wikipedia, it takes at most 6 clicks to get from one article to another. In six degrees of separation, all people are at most 6 social connections away from each other; the “small world” observation <a href="#separation">(1)</a>.</p>
<p>An explanation for the observed shallowness of knowledge is emergence: a phenomenon where a complex system has properties that its parts don’t have <a href="#emergence">(2)</a>. Life emerges out of inanimate matter. Love is not just chemicals in your brain; it’s an emergent property of your environment and body. Sampling in hip-hop music illustrates how entirely new sounds can be created by copying pieces of preexisting sounds. A lot of complexity can be created from simple rules, to the point where it’s difficult or impossible to tell which part of a rule created which part of a system.</p>
<h3>P vs NP and complexity theory</h3>
<p>An abstract representation of a computer is simple. Called a Turing machine, it’s an infinitely long tape with a head that writes symbols on the tape, plus instructions on how to operate the machine. The complexity that can arise out of this machine is tremendous: we can write books, play games, socialize with people... the list is endless. Since programming a Turing machine directly is tedious, because we have to deal with every detail of the program, we define subroutines which we can use to build other subroutines, which lets us build complicated programs. This type of abstraction is how we are able to compound our knowledge, because we don’t have to worry about details that are irrelevant to the problem we are trying to solve. We couldn't make progress without abstractions, because we would have to start from scratch every time.</p>
<p>Even though it's difficult to create knowledge, it's easy to verify it. It’s easy to recognize a great album or a piece of art, but most people would be lost if they were tasked with creating something similar. This is analogous to the P versus NP problem, where a computer seemingly can't solve some classes of problems efficiently but can efficiently verify that a solution is correct. The combination, and thus complexity, of work, skills, emotional states and other variables your favorite artist possesses that made them create music you like took exponentially more time than it takes you to appreciate that music. Emergence takes a lot of time. It's costly for complicated things to create simpler things, where the simpler things are easy to recognize.</p>
<p>Some computer scientists like Avi Wigderson believe that “if any problem is solved by the brain, then it was an efficiently solvable problem” <a href="#avi">(3)</a>. This is misleading because it wasn't one brain that solved it. It was lots of brains; and not just brains: bodies and environments took part, since knowledge is passed not only through genetics but through human language as well. The knowledge that it took to solve that problem has been compounding and abstracting since the beginning of the universe, and now one brain got the right combination of everything that was designed through billions of years of evolution. Wigderson seems to be comparing the human brain with Turing machines, but Turing machines start solving most of their problems from scratch. The programs that run human minds have been running for longer and have accumulated a lot of knowledge, which lets them solve some classes of problems faster than computers. The spontaneous order of emergence is the cause of the sporadic instances of creativity. Once an efficient algorithm is found for a class of problems, it is easy to solve those problems because there is a formula. It took billions of years for the first airplane to be created. Now an airplane can be built in less than a week.</p>
<h3>Category theory, abstraction, and reduction</h3>
<p>One major hindrance to coming up with new ideas is that different fields have different cultures, which have different notations and terminology. To create ideas that are deemed “original”, one has to spend a lot of time internalizing different representations of different ideas, enough to merge them to a point where the combination makes sense. The more disparate the notation, the harder it is for a researcher to give different ideas a happy marriage. Since knowledge is created at an exponential rate, an effort to abstract different classes of knowledge into the same language is important, because it means less cognitive load for the person who’s trying to create knowledge.</p>
<p>Category theory is a general theory of functions that studies abstract relationships between functions <a href="#cat">(4)</a>. It’s recently been able to connect dissimilar areas like biology, computer science, electrical engineering, topology and chemistry under a common language. This and other attempts to get seemingly different fields under the same mathematical theory will be good for knowledge creation because it will dampen specialization; it will be easier for a biologist to take her idea and apply it to an electrical engineering problem without learning the details of circuit theory.</p>
<h3>Constructors</h3>
<p>Complex systems theory, which studies how different systems interact with each other, is meant to address questions of emergence and abstraction. It can be formalized by the mathematics of automata. Realizing these automata in the physical world is hard, because we have to deal with the laws of physics. Like the Turing machine, which can't solve arbitrarily hard classes of problems efficiently, a universal constructor won’t be able to make arbitrary things efficiently. It will take decades for us to build a constructor that constructs the things we want. There are many more ways to take advantage of how easily things connect with each other; of the shallowness of knowledge.</p>
<p>Discuss on <a href=" https://www.reddit.com/user/vtomole/comments/etw5aw/the_depth_of_knowledge/">Reddit</a></p>
<h2>References</h2>
<p>
<a name="separation"></a> (1) <a href="https://en.wikipedia.org/wiki/Six_degrees_of_separation">Six degrees of separation</a> </p>
<p>
<a name="emergence"></a> (2) <a href="https://en.wikipedia.org/wiki/Emergence">Emergence</a> </p>
<p>
<a name="avi"></a> (3) Wigderson, Avi “<a href="http://bit.ly/nqt33x">Knowledge, creativity, and P VS NP</a>”, 2009.</p>
<p>
<a name="cat"></a> (4) <a href="https://en.wikipedia.org/wiki/Category_theory">Category theory</a> </p>
</section>
<h2><a href="https://vtomole.com//blog/2020/01/01/robots">On Robots</a></h2>
<p><i>2020-01-01</i></p>
<section>
<p><i>"Puzzles to solve and matter to organize. I have a body in this world with muscles that have the potential to incur energy upon matter.
I could do more to reach the bounds of the world that my ego inhabits to discover what the self is."
 Thomas Grothe (September 9 2019)</i></p>
<p>Artificial general intelligence (AGI) is a machine that mimics the cognitive functions humans associate with the human mind, such as "learning" and "problem solving". Billions of dollars are spent on artificial intelligence research every year. Even though tremendous progress has been made in the past decade, we still don't have cars that can drive themselves in any environment, as was promised by experts in the field. This illustrates how formidable the task of creating AGI is.</p>
<p>A robot is a programmable computer that can perform a series of tasks. A robot is different from artificial intelligence because it's not supposed to simulate the human mind. The robots most people tend to imagine are ones that attempt to mimic the movements of the human body.</p>
<p>Mimicking the movements of the human body is easier than mimicking the human mind. Most people want an AGI because they believe these machines will be able to solve problems that human beings can't solve. Even if that's true, an AGI will have free will so it might decide that it doesn't want to solve a problem humans want to solve. That is the opposite of a tool! It might be better for mankind to hold the monopoly on knowledge creation; where we can use that knowledge to program robots to perform tasks we want rather than constructing a device that can potentially outhink us.</p>
<h2>3rd digital revolution</h2>
<p>We create tools to make our lives easier. Early technological revolutions were in agriculture and manufacturing. Later revolutions have been mostly digital. The current technological revolution merges manufacturing with the digital. Its outcome will be humans using computers to design and make all physical objects. <a href="#Niel">(1)</a></p>
<p>Software is very easy to scale compared to hardware. Software can be copied onto billions of devices. Copying and installing Linux on a new device is much easier than copying a Tesla factory. That is why most of the top ten most valuable companies by market capitalization are software companies. There are no manufacturing companies to be found! Even though manufacturing cannot be scaled as well as software, how close could we get if we tried?</p>
<p>To make the game even more stacked against manufacturing, computers are universal devices due to the Church-Turing thesis: any universal computer can do anything another universal computer can do. Analogously, imagine a factory that can make anything: a universal factory. That would be a very powerful device indeed. Building a universal factory is very hard compared to building a universal computer, because the real world is messier than abstract mathematical objects like computable functions. Progress in using computers to design and make physical objects is being made via 3D printing, 4D printing, and topological quantum computing; programmable matter in general. We don't know for sure what the first universal factory will look like. <a href="#Nielsen">(2)</a></p>
<h2>Universal constructors and universal assemblers</h2>
<p>Von Neumann's universal constructor is a machine that can create any other machine that can be programmed in its cellular automaton and described on a finite but arbitrarily long tape. Human beings have built a lot of amazing objects: pyramids, spaceships, superconductors, etc. It follows that a human being without free will and creativity (a human being without a mind) is a universal constructor. By this definition, humans are more powerful than a universal factory, because a machine that builds another machine from a program doesn't need to be creative to do its job; it doesn't need to design the machine and write the program that builds it, it just executes the program it's given. <a href="#Deutsch">(3)</a></p>
<p><a href="https://www.fablabs.io/">Fab Labs</a> allow anyone to make (almost) anything. The "(almost)" hints that they are our next step toward building a universal constructor. Assuming the layouts of all official Fab Labs are the same, we can build robots specifically designed to navigate the Fab Lab and place them in the lab. Robots that are as good as humans at vision, smell, and touch can be universal constructors, because they can theoretically build anything humans can build, given a program that describes what needs to be built. Humans can program the Fab Lab by building something in it while the Fab Lab records their movements. Once the object is built by humans, the Fab Lab can turn the recording into instructions and clone the object that the human made. The only resources required from humans are the knowledge of how to build something and the raw materials required to build it.</p>
<h2>Epilogue</h2>
<p>It's a nightmare when technology goes out of control. A machine that decides it has no use for us is about as bad as it gets. Machines and humans should work together: humans come up with novel solutions to problems, and machines solve those problems scalably. For this reason, it's better to invest in creating a universal constructor to augment our minds rather than an AGI to take them over.</p>
<b>Thanks</b> to Zuriel Omole for reading drafts of this.
<p><a href=" https://www.reddit.com/user/vtomole/comments/eipxck/on_robots/">Comments</a></p>
<h2>References</h2>
<p>
<a name="Niel"></a> (1) <a href="http://cba.mit.edu/docs/papers/19.01.POW.pdf">Digital Fabrication and the Future of Work</a>, J. Cutcher-Gershenfeld, A. Gershenfeld, and N. Gershenfeld, Perspectives on Work, Labor and Employment Relations Association, pp. 8-13 (2018). </p>
<p>
<a name="Nielsen"></a> (2) Nielsen, Michael A. “<a href=" http://cognitivemedium.com/vme">The varieties of material existence</a>”, 2018.</p>
<p>
<a name="Deutsch"></a> (3) Deutsch, David (2012). "Constructor Theory". <a href="https://arxiv.org/abs/1210.7439">1210.7439</a> </p>
</section>
<h2><a href="https://vtomole.com//blog/2019/07/29/field">Fields and Quantum Error Correction</a></h2>
<p><i>2019-07-29</i></p>
<section>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}});
</script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js?config=TeX-MML-AM_CHTML'></script>
<script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js"></script>
$\newcommand{\ket}[1]{\left|{#1}\right\rangle}$
$\newcommand{\bra}[1]{\langle{#1}|}$
<p>A field is a set on which addition and multiplication are defined with the following properties:
associativity, commutativity, distributivity, identity elements, and inverses.
A finite field is a field with a finite number of elements. The finite field $\mathbb{F}_2$ is
</p>
<p>$\begin{array}{ccc}
\hline
+ & 0 & 1 \\ \hline
0 & 0 & 1 \\ \hline
1 & 1 & 0 \\ \hline
\times & 0 & 1 \\ \hline
0 & 0 & 0 \\ \hline
1 & 0 & 1 \\ \hline
\end{array}$
</p>
<p>This field is universal for classical computation: addition on the set $A = \{0, 1\}$ defines
the XOR gate, and multiplication on $A$ defines the AND gate. From these, the NAND gate can be
constructed: hard-coding a 1 on one input of the XOR gate gives the NOT gate, and applying NOT
to the output of AND gives NAND.
</p>
<p>The stabilizer gate set $S = \{CNOT, Hadamard, P\}$, where
$P = \ket{0}\bra{0} + i\ket{1}\bra{1}$,
guarantees that if an $n$-qubit superposition is created between basis states, those
basis states will form an affine subspace of $\mathbb{F}_2^n$. Taking
a one-qubit system, the superpositions that can be generated by the stabilizer set are
$\frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$,
$\frac{1}{\sqrt{2}}(\ket{0} - \ket{1})$, $\frac{1}{\sqrt{2}}(\ket{0} + i\ket{1})$, and
$\frac{1}{\sqrt{2}}(\ket{0} - i\ket{1})$. The number of strings in the basis set of $\ket{0} + i\ket{1}$,
for example, is a power of two ($2^1$); the same holds for a two-qubit system (e.g. $\ket{00} + \ket{11}$). Since the stabilizer gate set and states are used
to analyze quantum error-correcting codes, this is an example of how finite fields are used in
<a href="https://en.wikipedia.org/wiki/Coding_theory">Coding theory</a>, and of how some quantum error-correcting
codes can be efficiently simulated by classical computers.</p>
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/17">Github</a></p>
<h2>References</h2>
<ul>
<li><a href = "https://en.wikipedia.org/wiki/Field_(mathematics)">Field (mathematics)</a></li>
<li><a href = "https://www.scottaaronson.com/qclec/28.pdf">Lecture 28, Tues May 2: Stabilizer Formalism (PDF)</a></li>
</ul>
</section>
<h2><a href="https://vtomole.com//blog/2019/06/08/scrambling">Quantum scrambling and teleportation</a></h2>
<p><i>2019-06-08</i></p>
<section>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}});
</script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js?config=TeX-MML-AM_CHTML'></script>
<script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js"></script>
$\newcommand{\ket}[1]{\left|{#1}\right\rangle}$
$\newcommand{\bra}[1]{\langle{#1}|}$
$\newcommand{\norm}[1]{\left\lVert#1\right\rVert}$
<p>Quantum information scrambling is when local information is dispersed throughout a quantum system.
This phenomenon is similar to quantum decoherence, where the information in a system is lost to the environment.
Another way of viewing it is that the qubits of the system entangle with, and thus correlate to, the qubits
in the environment. The problem is that quantum decoherence happens naturally to quantum computers, since qubits can’t
be perfectly isolated from the environment. This makes an experiment that tests for quantum information scrambling
difficult, because there seems to be no way to distinguish whether a system dispersed its information
throughout itself or into the environment without resorting to quantum state tomography,
which is intractable as the size of the quantum system grows. Fortunately, there is a way to differentiate
between a quantum-scrambled system and a quantum-decohered one by means of quantum teleportation.</p>
<p><a href="https://vtomole.github.io/blog/2018/11/23/basis">Quantum teleportation</a>
is the problem of Alice needing to transport her state to Bob using classical
communication and an EPR pair shared by both parties. Unlike the standard quantum teleportation protocol, where the
fidelity of sending the unknown quantum state is 100%, the protocol discussed here is probabilistic. First,
we define the many-body system by creating 3 EPR pairs; then we apply $U$ to half of the system and $U^\dagger$ to the other
half, where $U$ (a unitary that delocalizes all single-qubit operators into three-qubit operators) performs the scrambling.
Since Alice's quantum state is fully scrambled into the system, Bob can successfully decode it by making a Bell measurement
on any pair of qubits.</p>
<p>The quantum circuit and program that define the many-body system, Alice's state, the scrambling of Alice's state, and
its decoding by Bob are displayed below.
The success rate of this protocol is usually around 50%.
</p>
<p><i>[Seven-qubit circuit diagram (the text output of printing the program below); its connector lines were lost in formatting, so it is omitted here.]</i></p>
<pre class="prettyprint lang-python">
"""Demonstrate quantum scrambling using teleportation, without Grover search."""
import numpy as np
import cirq


def bell_basis_measurement(first_qubit, second_qubit):
    circuit = cirq.Circuit()
    circuit.append([
        cirq.CNOT(first_qubit, second_qubit),
        cirq.H(first_qubit),
        cirq.measure(first_qubit),
        cirq.measure(second_qubit),
    ])
    return circuit


def scrambling_circuit(q0, q1, q2, q3, q4, q5):
    circuit = cirq.Circuit()
    circuit.append([
        # Define U
        cirq.CZ(q0, q2),
        cirq.CZ(q0, q1),
        cirq.CZ(q1, q2),
        cirq.H(q0),
        cirq.H(q1),
        cirq.H(q2),
        cirq.CZ(q0, q2),
        cirq.CZ(q0, q1),
        cirq.CZ(q1, q2),
        # Define the conjugate of U
        cirq.CZ(q3, q5),
        cirq.CZ(q4, q5),
        cirq.CZ(q3, q4),
        cirq.H(q3),
        cirq.H(q4),
        cirq.H(q5),
        cirq.CZ(q3, q5),
        cirq.CZ(q4, q5),
        cirq.CZ(q3, q4),
    ])
    return circuit


def scrambling_teleportation(circuit, qubits_to_measure):
    """
    Output example: Fidelity 48/100
    """
    repetitions = 100
    # Prepare Alice's state |1>
    circuit.append([
        cirq.X(q0),
    ])
    # Define the many-body system
    circuit.append([
        cirq.H(q1),
        cirq.H(q2),
        cirq.H(q5),
        cirq.CNOT(q5, q6),
        cirq.CNOT(q2, q3),
        cirq.CNOT(q1, q4),
    ])
    # Apply U and the conjugate of U to the subsystems
    scrambling_unitary = scrambling_circuit(q0, q1, q2, q3, q4, q5)
    circuit.append([
        scrambling_unitary,
        bell_basis_measurement(qubits_to_measure[0], qubits_to_measure[1]),
        cirq.measure(q6, key='bob'),
    ])
    results = cirq.Simulator().run(circuit, repetitions=repetitions)
    bob_measurement = np.array(results.measurements['bob'][:, 0])
    print("Fidelity {}/{}".format(sum(bob_measurement), repetitions))


if __name__ == '__main__':
    q0, q1, q2, q3, q4, q5, q6 = cirq.LineQubit.range(7)
    circuit = cirq.Circuit()
    # Any Bell pair measurement should give the same fidelity (around 50%)
    scrambling_teleportation(circuit, qubits_to_measure=[q1, q2])
</pre>
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/16">Github</a></p>
<h2>Reference</h2>
<ul>
<li><a href = "https://arxiv.org/abs/1806.02807">Verified Quantum Information Scrambling</a></li>
</ul>
</section>
<h2><a href="https://vtomole.com//blog/2019/04/07/trotter">Hamiltonian simulation and Trotterization</a></h2>
<p><i>2019-04-07</i></p>
<section>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}});
</script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js?config=TeX-MML-AM_CHTML'></script>
<script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js"></script>
$\newcommand{\ket}[1]{\left|{#1}\right\rangle}$
$\newcommand{\bra}[1]{\langle{#1}|}$
$\newcommand{\norm}[1]{\left\lVert#1\right\rVert}$
<p>The evolution of a quantum system is governed by the Schrödinger equation $i \dfrac{d\ket{\psi(t)}}{dt} = H \ket{\psi(t)}$. Its solution, $\ket{\psi(t)} = e^{-iHt}\ket{\psi(0)}$, describes what state a system will be in after a Hamiltonian has been applied to it for a certain period of time. The problem of Hamiltonian simulation is thus stated as follows: given a Hamiltonian $H$ and an evolution time $t$, output a sequence of computational gates that implements $U = e^{-iHt}$. This problem is meaningful because simulating the dynamics of a quantum system is an essential problem in quantum physics and quantum chemistry. It's widely believed that the Hamiltonian simulation problem requires an exponential number of gates on a classical computer while requiring only a polynomial number on a quantum computer.</p>
<p>Hamiltonians are Hermitian operators that are usually a sum of a large number of individual Hamiltonians $H_{j}$. For example, a Hamiltonian $H$ can be equal to $H_{1} + H_{2}$. The evolution under this sum of two Hamiltonians can be described by the Lie product formula $e^{-i(H_1+H_2)t} = \lim_{N \rightarrow \infty} (e^{-iH_1t/N}e^{-iH_2t/N})^N$. Since the limit in this formula is infinite, we have to truncate the series when implementing it on a quantum computer. The truncation introduces error into the simulation, which we can bound by a maximum simulation error $\epsilon$ such that $\norm{e^{-iHt} - U} \leq \epsilon$. This truncation is known as Trotterization, and it's widely used to simulate non-commuting Hamiltonians on quantum computers. The Trotterization formula is then $e^{-iHt} = \left(e^{-iH_{0}t/r}\, e^{-iH_{1}t/r} \cdots e^{-iH_{d-1}t/r}\right)^r + O(t^2/r)$, up to polynomial factors in the number of terms.</p>
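<p>The Lie product formula can be checked numerically. The sketch below (my own example, with $H_1 = X$, $H_2 = Z$, and $t = 1$) uses the identity $e^{-iaP} = \cos(a)I - i\sin(a)P$ for any matrix $P$ with $P^2 = I$, and shows the truncation error shrinking as $N$ grows:</p>

```python
import numpy as np

# Pauli matrices and identity.
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_exp(a, P):
    """exp(-i*a*P) for a matrix P satisfying P^2 = I."""
    return np.cos(a) * I - 1j * np.sin(a) * P

# Exact evolution under H = X + Z for time t: (X+Z)/sqrt(2) squares to I,
# so exp(-iHt) = cos(sqrt(2) t) I - i sin(sqrt(2) t) (X+Z)/sqrt(2).
t = 1.0
exact = pauli_exp(np.sqrt(2) * t, (X + Z) / np.sqrt(2))

# Truncated Lie product: (exp(-iXt/N) exp(-iZt/N))^N for increasing N.
for N in (1, 10, 100):
    step = pauli_exp(t / N, X) @ pauli_exp(t / N, Z)
    approx = np.linalg.matrix_power(step, N)
    print(N, np.linalg.norm(exact - approx))  # error shrinks roughly as 1/N
```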
<p>Suppose we want to simulate $H = X_{0} + Y_{1} + Z_{2}$, where $X$, $Y$ and $Z$ are Pauli matrices and the subscripts label the qubits the terms act on. In general, the terms of a Hamiltonian need not commute, so we can't always evolve each term separately for the full time. With Trotterization, we evolve the whole Hamiltonian by repeatedly switching between evolving $X_{0}$, $Y_{1}$, and $Z_{2}$, each for a small period of time. The first step in deriving the circuit that simulates this Hamiltonian is finding the quantum gates that implement each of its individual terms. This case is simple, since the quantum gates $R_{x}(\theta) = e^{-i \frac{\theta}{2}X}$, $R_{y}(\theta) = e^{-i \frac{\theta}{2}Y}$, and $R_{z}(\theta) = e^{-i \frac{\theta}{2}Z}$ implement the individual terms perfectly. $\theta$ specifies the angle by which to rotate the state about a specified axis; this is roughly synonymous with time, since it specifies how "long" to apply the Hamiltonian to the qubit (rotating by $\pi/2$ could take 30 nanoseconds, rotating by $\pi$ could take 60 nanoseconds, etc.). Since we don't care too much about the simulation error in this example, we will use $r = 2$ and $t = 1$ in our Trotter formula to get $e^{-i(X_{0} + Y_{1} + Z_{2})} \approx \left(e^{-iX_{0}/2}\, e^{-iY_{1}/2}\, e^{-iZ_{2}/2}\right)^2$.</p>
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/1">Github</a></p>
<h2>References</h2>
<ul>
<li><a href = "https://arxiv.org/abs/1904.01131">Q# and NWChem: Tools for Scalable Quantum Chemistry on Quantum Computers</a></li>
</ul>
<section/>
Quantum Teleportation and Basis Measurements20181123T00:00:00+00:00https://vtomole.com//blog/2018/11/23/basis <section>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}});
</script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js?config=TeX-MML-AM_CHTML'></script>
<script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js"></script>
$\newcommand{\ket}[1]{\left|{#1}\right\rangle}$
$\newcommand{\bra}[1]{\left\langle{#1}\right|}$
<p>It's easy to be confused about what "measuring in a basis" means, especially if one is not familiar with Linear Algebra. Measuring in a basis means measuring the state $\ket{\psi} = \alpha \ket{\upsilon} + \beta \ket{\omega}$, where $\ket{\psi}$ is expressed in terms of the basis $\left\{ \ket{\upsilon }, \ket{\omega } \right\}$. Measurements are represented by a set of operators $P_i$ satisfying $\sum_{i}^{} P_i = \mathbb{I}$. This means that measurement in a basis $\left\{ \ket{\upsilon }, \ket{\omega } \right\}$ is defined by the operators $P_{\upsilon} = \ket{\upsilon} \bra{\upsilon}$ and $P_{\omega} = \ket{\omega} \bra{\omega}$. When $\ket{\psi}$ is measured, its outcome is $i$ with probability $p_i = \bra{\psi} P_i \ket{\psi}$. Measurement in the standard basis $\left\{ \ket{0}, \ket{1} \right\}$ is then given by $P_0 = \ket{0} \bra{0}$ and $P_1 = \ket{1} \bra{1}$. This makes $p_0$ = ($\alpha^{*} \bra{0} + \beta^{*} \bra{1}$) ($\ket{0} \bra{0}$) ($\alpha \ket{0} + \beta \ket{1}$) = ${\mid {\alpha} \mid}^2$. </p>
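<p>The formula $p_i = \bra{\psi} P_i \ket{\psi}$ can be checked directly in a few lines of NumPy (the amplitudes $\alpha = 0.6$, $\beta = 0.8$ are arbitrary example values):</p>

```python
import numpy as np

alpha, beta = 0.6, 0.8  # example amplitudes; |alpha|^2 + |beta|^2 = 1
psi = np.array([alpha, beta], dtype=complex)

# Standard-basis projectors P_0 = |0><0| and P_1 = |1><1|
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

# The measurement operators sum to the identity
assert np.allclose(P0 + P1, np.eye(2))

# p_i = <psi| P_i |psi>
p0 = (psi.conj() @ P0 @ psi).real
p1 = (psi.conj() @ P1 @ psi).real
print(p0, p1)  # |alpha|^2 and |beta|^2
```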
<p>Measurements can be done in any orthonormal basis, where $\ket{\upsilon }$ is measured with probability ${\mid {\alpha} \mid}^2$ and $\ket{\omega }$ is measured with probability ${\mid {\beta} \mid}^2$. Since qubits have the basis states $\left\{ \ket{0}, \ket{1} \right\}$, a measurement in another basis is performed in the quantum circuit model, with $\ket{0}$ representing $\ket{\upsilon}$ and $\ket{1}$ representing $\ket{\omega}$, by applying a unitary gate corresponding to the appropriate basis change followed by a measurement in the standard basis. </p>
<p> In the quantum teleportation protocol, two parties Alice and Bob share an EPR pair $\frac{1}{\sqrt{2}} \ket{00} + \frac{1}{\sqrt{2}} \ket{11} $. Even if these parties are separated, the measurements on their respective qubits will be correlated. If Alice has another qubit in the state $\ket{\psi}$ = $\alpha \ket{0}$ + $\beta \ket{1}$ and she wants to give it to Bob, she can do it using her half of the EPR pair and classical communication. Alice will perform a Bell Basis measurement on her EPR qubit together with the quantum state she wants to transmit to Bob. Remember, a quantum gate followed by a measurement in the computational basis is a measurement in a different basis; so a Bell Basis measurement will be a CNOT gate from the unknown quantum state to Alice's EPR qubit followed by a Hadamard gate on the unknown quantum state. This is a Bell Basis measurement because the Bell state is prepared by applying a Hadamard gate on the first qubit followed by a CNOT gate from the first to the second qubit. Alice will then send the results of her Bell Basis measurements to Bob via 2 bits. Bob will perform operations on his half of the EPR pair based on the bits he receives from Alice to get the state that Alice wants to teleport to him. If he receives $\ket{00}$, he will apply the identity gate. If he receives $\ket{01}$, he will apply the X gate, if he receives $\ket{10}$, he will apply the Z gate, and if he receives $\ket{11}$, he will apply both the X and Z gates. </p>
<p> If Alice wants to teleport $\ket{1}$ to Bob, she can do so with the following procedure: </p>
<pre>
(0, 0): ───X───────@───H───M('s')───@──────────────
                   │                │
(0, 1): ───H───@───X───M('a')───@───┼──────────────
               │                │   │
(0, 2): ───────X────────────────X───@───M('b')─────
</pre>
The following is a program that represents the corresponding circuit:
<pre class="prettyprint lang-python">
"""To demonstrate teleportation on a NISQ machine """
import numpy as np
import cirq


def main():
    """
    Output: The result of measuring Bob's qubit [True]
    """
    state = cirq.GridQubit(0, 0)
    alice = cirq.GridQubit(0, 1)
    bob = cirq.GridQubit(0, 2)
    circuit = cirq.Circuit()
    # Prepare the state to be |1>
    circuit.append([
        cirq.X(state),
    ])
    # Prepare shared entangled state.
    circuit.append([
        cirq.H(alice),
        cirq.CNOT(alice, bob),
    ])
    # Bell Basis measurement on state and Alice's qubit
    circuit.append([
        cirq.CNOT(state, alice),
        cirq.H(state),
        cirq.measure(state, key='s'),
        cirq.measure(alice, key='a'),
    ])
    # Classical communication. This simulates Bob applying
    # operations on his qubit based on the bits he receives from Alice
    circuit.append([
        cirq.CNOT(alice, bob),
        cirq.CZ(state, bob),
    ])
    circuit.append([cirq.measure(bob, key='b')])
    result = cirq.google.XmonSimulator().run(circuit)
    bob_measurement = np.array(result.measurements['b'][:, 0])
    print("The result of measuring Bob's qubit ", bob_measurement)


if __name__ == '__main__':
    main()
</pre>
</p>
<section/>
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/15">Github</a></p>
<section/>
<h2>References</h2>
<ul>
<li><a href = "https://www.cs.cmu.edu/~odonnell/quantum15/lecture03.pdf">Lecture 3: The Power of Entanglement</a></li>
</ul>
<section/>
Quantum Phase Estimation and Hadamard test20180520T00:00:00+00:00https://vtomole.com//blog/2018/05/20/pea <section>
<h1>Introduction</h1>
<p>Quantum Phase estimation is an important subroutine in quantum computing. It's used for factoring, quantum simulation, and the discrete logarithm problem. A version of Phase estimation called the Hadamard test is used to approximate the Jones polynomial (a significant expression in <a href = "https://en.wikipedia.org/wiki/Knot_theory">Knot theory</a>). Quantum computers can estimate phases more efficiently than classical computers because they can exploit the Quantum Fourier transform, which requires $O(n^2)$ gates on a quantum computer instead of the $O(n 2^n)$ operations a classical Fourier transform takes. This post will cover Phase estimation and the Hadamard test, including their implementations on a quantum computer.</p>
<section/>
<section/>
<h2>Quantum Phase estimation</h2>
$\newcommand{\ket}[1]{\left|{#1}\right\rangle}$
$\newcommand{\bra}[1]{\left\langle{#1}\right|}$
<p>Quantum Phase estimation solves the following problem: given a unitary $\textit{U}$ and an eigenvector $\ket{\psi}$ of $\textit{U}$ such that $U\ket{\psi} = e^{2 \pi i \theta} \ket{\psi}$ where $0 \leq \theta < 1$, find the eigenvalue $e^{2 \pi i \theta}$ of $\ket{\psi}$ or, equivalently, the phase $\theta$. </p>
<p> This algorithm uses 2 registers. The first register contains $m$ qubits; the more qubits $m$, the more accurate the estimate of $\theta$ will be. The second register contains $\ket{\psi}$. Quantum Phase estimation first prepares the state $\ket{0}_{m} \ket{\psi}$ by initializing the first $m$ qubits to $\ket{0}$ and encoding $\ket{\psi}$ in the second register. Hadamard gates are then applied to each qubit in the first register</p>
\[ \ket{\varphi_{0}} = H^{\otimes m} \ket{0}_{m} \ket{\psi} = \frac{1}{\sqrt{2^m}}[(\ket{0} + \ket{1})_{0} \, (\ket{0} + \ket{1})_{1} \, (\ket{0} + \ket{1})_{2}\, ... \, (\ket{0} + \ket{1})_{m-1}] \ket{\psi} \]
<p>Controlled-$U^{2^j}$ (c$U^{2^j}$) gates are then applied to the second register, with qubit $j$ of the first register acting as the control; remember that $U \ket{\psi} = e^{2 \pi i \theta} \ket{\psi}$</p>
\[ \ket{\varphi_{1}} = \prod_{j=0}^{m-1} cU^{2^j} \ket{\varphi_{0}} = \frac{1}{\sqrt{2^m}}[(\ket{0} + e^{2 \pi i \theta 2^{0}} \ket{1})_{0} \, (\ket{0} + e^{2 \pi i \theta 2^{1}} \ket{1})_{1} \, (\ket{0} + e^{2 \pi i \theta 2^{2}} \ket{1})_{2}\, ... \, (\ket{0} + e^{2 \pi i \theta 2^{m-1}} \ket{1})_{m-1}] \ket{\psi} \]
<p>This equation can be written as</p>
\[ \ket{\varphi_{1}} = \frac{1}{\sqrt{2^m}} [ \sum_{k=0}^{2^m - 1} e^{2 \pi i \theta k} \ket{k}] \ket{\psi} \]
<p>Which is similar to the Quantum Fourier transform</p>
\[\text{QFT} \ket{x} = \frac{1}{\sqrt{2^m}} \sum_{k=0}^{2^m - 1} e^{\frac{2 \pi i x k}{2^m}} \ket{k} \]
<p>The first register of $\ket{\varphi_{1}}$ has the form of QFT $\ket{x}$ with $x = 2^m \theta$. To get $\theta$, the inverse of the Quantum Fourier transform (the QFT run in reverse) is applied to the first register.</p>
\begin{align}\text{QFT}^{-1} \text{QFT} \ket{2^m \theta} = \text{QFT}^{-1} \frac{1}{\sqrt{2^m}} \sum_{k=0}^{2^m - 1} e^{2 \pi i \theta k} \ket{k} = \ket{\theta_{0} \, \theta_{1} \, \theta_{2} ... \, \theta_{m-1} } \end{align}
<p>The state of the second register doesn't change during the computation, so the final state of the system before measurement is $\ket{\theta_{0} \, \theta_{1} \, \theta_{2} ... \, \theta_{m-1} } \ket{\psi}$, where $\theta_{0} \theta_{1} ... \theta_{m-1}$ are the bits of the binary expansion of $\theta$. Measurement of the first register will result in an approximation of $\theta$.</p>
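<p>The inverse-QFT step can be sanity-checked numerically. The sketch below (plain NumPy; the matrix <code>iqft</code> is built directly from the definition) prepares the first-register state for a $\theta$ that is exactly representable in $m$ bits and verifies that the inverse QFT maps it to the basis state labeled $2^m \theta$:</p>

```python
import numpy as np

m = 3
theta = 0.375  # binary 0.011, exactly representable with m = 3 bits

# First register after the controlled-U gates:
# (1/sqrt(2^m)) * sum_k e^{2 pi i theta k} |k>
dim = 2 ** m
k = np.arange(dim)
register = np.exp(2j * np.pi * theta * k) / np.sqrt(dim)

# Inverse QFT matrix: entries e^{-2 pi i j k / 2^m} / sqrt(2^m)
iqft = np.exp(-2j * np.pi * np.outer(k, k) / dim) / np.sqrt(dim)
out = iqft @ register

# Measuring the register returns j = 2^m * theta with probability ~1
probs = np.abs(out) ** 2
print(np.argmax(probs), dim * theta)
```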
<h3>Quantum Phase estimation example</h3>
<p>Given a unitary
$U = \begin{bmatrix}
0 & 1 \\
1 & 0 \\
\end{bmatrix} $, and an eigenvector of $U$, $(1, 1)$, with eigenvalue $\lambda = e^{2 \pi i \theta}$, find the phase $\theta$. A calculation of the eigenvalues of $U$ gives $\lambda_{1} = 1$ and $\lambda_{2} = -1$, so $\theta_{1} = 0$ and $\theta_{2} = \frac{1}{2}$. The phases can be calculated on a quantum computer, where $\theta_{1}$ corresponds to the measured result $0$ and $\theta_{2}$ to $1$ because it's the only other option (using more qubits to estimate $\frac{1}{2}$ is not necessary in this case).</p>
<p>The eigenvalue can be approximated with a single bit, so the first register will contain one qubit. The second register will also contain one qubit because $U$ is a one-qubit gate (it is the X gate). To encode the eigenvector $(1, 1)$ into the second register, it has to be normalized to $(\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}})$. The Hadamard gate is applied to the first qubit, resulting in $\frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$. The normalized eigenvector $\ket{\psi}$ is also encoded using the Hadamard gate. The state of the system is </p>
\[ \frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \otimes \frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \]
<p>Applying the cU gate</p>
\[cU [\frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \otimes \ket{\psi}] = \frac{1}{\sqrt{2}}(\ket{0} \ket{\psi} + \ket{1} U \ket{\psi}) = \frac{1}{\sqrt{2}}(\ket{0} + e^{2 \pi i \cdot 0} \ket{1}) \otimes \frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \]
<p>The inverse QFT of this equation is calculated by applying the Hadamard gate to the first qubit.</p>
\[ H [\frac{1}{\sqrt{2}}(\ket{0} + e^{2 \pi i 0} \ket{1})] \frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) = \ket{0} \frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \]
<p>Measuring the first qubit gives $0$ as predicted by theory.</p>
<h3>Phase estimation implementation on a Quantum computer</h3>
<pre>
<code>
from pyquil.quil import Program
from pyquil.gates import H, Z, CNOT
from pyquil.api import QVMConnection

qvm = QVMConnection()

phase_estimation_0 = Program(
    H(0),
    H(1),
    CNOT(0, 1),  # CNOT is another name for cX, which is cU in this example
    H(0),
)
print("Expected answer is 0")
result = qvm.run_and_measure(phase_estimation_0, [0], 1)
print(result)
</code>
</pre>
<p>To estimate $\theta_{2}$, the other eigenvector of $U$, $(1, -1)$, normalized to $(\frac{1}{\sqrt{2}},-\frac{1}{\sqrt{2}})$, will be loaded into the second register. This is accomplished by the Hadamard gate followed by the Z gate</p>
\[ZH\ket{0} = \frac{1}{\sqrt{2}}(\ket{0} - \ket{1})\]
<p>The remaining instructions for this program will be the same as in the previous one. A quick calculation will show that the first register contains $\ket{1}$, which encodes $\theta_{2} = \frac{1}{2}$.</p>
<pre>
<code>
phase_estimation_1 = Program(
    H(0),
    H(1),
    Z(1),  # load the eigenvector (1, -1)/sqrt(2) into the second register
    CNOT(0, 1),  # CNOT is another name for cX, which is cU in this example
    H(0),
)
print("Expected answer is 1")
result = qvm.run_and_measure(phase_estimation_1, [0], 1)
print(result)
</code>
</pre>
<p>These programs are implemented <a href = "https://github.com/QCHackers/qchackers/blob/master/pyquil/phase_estimation.py">here</a>.</p>
</section>
<h2>Hadamard Test</h2>
<p>The Hadamard test solves: Given a unitary $U$ and a state $\ket{\psi}$, estimate $\bra{\psi} U \ket{\psi}$. It prepares the state $\frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \ket{\psi}$ by applying the Hadamard gate to the first qubit, then applies cU, controlled on $\frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$ and targeting $\ket{\psi}$, to get $\frac{1}{\sqrt{2}}(\ket{0} \ket{\psi} + \ket{1} U \ket{\psi})$. After the Hadamard gate is applied to the first qubit again, the probability of measuring $\ket{0}$ is \[\frac{1}{2} (1 + \text{Re} \bra{\psi} U \ket{\psi}) \] The probability of measuring $\ket{1}$ is
\[\frac{1}{2} (1 - \text{Re} \bra{\psi} U \ket{\psi})\]
The result we want to approximate is calculated by subtracting the probabilities
\[ \text{Re} \bra{\psi} U \ket{\psi} = \frac{1}{2} (1 + \text{Re} \bra{\psi} U \ket{\psi}) - \frac{1}{2} (1 - \text{Re} \bra{\psi} U \ket{\psi}) \]
where $\text{Re} \bra{\psi} U \ket{\psi}$ is the real component of $\bra{\psi} U \ket{\psi}$. $\text{Im} \bra{\psi} U \ket{\psi}$ is estimated by first preparing the state $\frac{1}{\sqrt{2}}(\ket{0} + i \ket{1})$ instead of $\frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$. This is prepared by applying the Hadamard gate followed by the S gate. The steps involved in deriving the expected values are <a href = "https://en.wikipedia.org/wiki/Hadamard_test_(quantum_computation)">here</a> </p>
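<p>These probabilities can be checked with a small statevector simulation. A sketch in plain NumPy for the single-qubit case $U = X$ and $\ket{\psi} = \ket{1}$, where the true value is $\text{Re} \bra{1} X \ket{1} = 0$:</p>

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
I2 = np.eye(2, dtype=complex)

psi = np.array([0, 1], dtype=complex)  # |1>
ancilla = H @ np.array([1, 0], dtype=complex)  # (|0> + |1>)/sqrt(2)
state = np.kron(ancilla, psi)

# Controlled-U with the ancilla as control: |0><0| (x) I + |1><1| (x) U
cU = np.kron(np.diag([1, 0]), I2) + np.kron(np.diag([0, 1]), X)
state = np.kron(H, I2) @ (cU @ state)  # apply cU, then H on the ancilla

# p0 = probability the ancilla measures 0 (amplitudes with ancilla = 0)
p0 = float(np.sum(np.abs(state[:2]) ** 2))
re = 2 * p0 - 1  # Re<psi|U|psi> = p0 - p1
print(re)
```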
<h3>Hadamard test example</h3>
<p>Given a unitary
$U = \begin{bmatrix}
0 & 1 \\
1 & 0 \\
\end{bmatrix} $, and a state $\ket{1}$, estimate $\bra{1} U \ket{1}$. Since $U \ket{1} = \ket{0}$, we have $\bra{1} U \ket{1} = \langle 1 \vert 0 \rangle = 0$.</p>
<p> Encoding $\ket{1}$ is fulfilled by applying the X gate on the second qubit:
$\ket{0} \text{X}(\ket{0}) = \ket{0} \ket{1}$. The Hadamard gate is applied to the first qubit: $H (\ket{0}) \ket{1}$ = $\frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \ket{1}$. Applying cU to the system
\[cU [\frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) \ket{1}] = \frac{1}{\sqrt{2}}(\ket{0} \ket{1} + \ket{1} U \ket{1}) = \frac{1}{\sqrt{2}}(\ket{01} + \ket{10}) \]
Applying Hadamard to the first qubit results in:
\[ H [\frac{1}{\sqrt{2}}(\ket{01} + \ket{10})] = \frac{1}{2}\ket{00} + \frac{1}{2}\ket{01} - \frac{1}{2}\ket{10} + \frac{1}{2}\ket{11} \]
This means that upon measurement there is a 0.5 probability that the first qubit is 0 and a 0.5 probability that it's 1. $0.5 - 0.5 = 0$, the expected value.</p>
<h3>Hadamard test implementation on a Quantum computer</h3>
<pre>
<code>
hadamard_test0 = Program(
    X(1),  # encode |psi> = |1> in the second qubit
    H(0),
    CNOT(0, 1),
    H(0),
)
print("Expected value will be closer to 0 than -100 or 100")
result = qvm.run_and_measure(hadamard_test0, [0], 100)
print(get_Re(result))  # get_Re is defined in the linked implementation
</code>
</pre>
<p>The Hadamard test is probabilistic. It gives an expectation value that is close to 0, as predicted by theory.</p>
<p>What if $\ket{\psi}$ is $\frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$ instead of $\ket{1}$? Writing $\ket{+} = \frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$, we have $\bra{+} U \ket{+} = \langle + \vert + \rangle = 1$, since $X \ket{+} = \ket{+}$. Executing this program on a quantum computer shows that the answer is 1.</p>
<pre>
<code>
hadamard_test1 = Program(
    H(1),  # encode |psi> = (|0> + |1>)/sqrt(2) in the second qubit
    H(0),
    CNOT(0, 1),
    H(0),
)
print("Expected value will be 1")
result = qvm.run_and_measure(hadamard_test1, [0], 1)
print(get_Re(result))
</code>
</pre>
<p>Implementations of these Hadamard tests are <a href = "https://github.com/QCHackers/qchackers/blob/master/pyquil/hadamard_test.py">here</a></p>
</section>
<section>
<h2>Conclusion</h2>
<p>Finding the eigenvalues of an operator is an important problem in Linear Algebra. Quantum Phase Estimation and its cousin the Hadamard test are important subroutines for quantum algorithms that leverage the Quantum Fourier transform to perform operations more efficiently on quantum computers than on classical computers. Phase estimation can be used in quantum simulation and factoring. The Hadamard test is used to approximate the Jones polynomial, which is important in knot theory and topological quantum field theory. This post covered examples of how the Hadamard test and the Phase Estimation algorithm work.</p>
<section/>
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/14">Github</a></p>
<section>
<h2>References</h2>
<ul>
<li><a href = "https://en.wikipedia.org/wiki/Hadamard_test_(quantum_computation)">Wikipedia entry on Hadamard Test</a></li>
<li><a href = "https://en.wikipedia.org/wiki/Quantum_phase_estimation_algorithm">Wikipedia entry on Quantum phase estimation</a></li>
<li><a href = "https://en.wikipedia.org/wiki/Quantum_Fourier_transform">Wikipedia entry on Quantum Fourier transform</a></li>
</ul>
<section/>
Quantum programming at Hack ISU20180329T00:00:00+00:00https://vtomole.com//blog/2018/03/29/qchackisu<p><strong>Introduction</strong></p>
<p>Qchackers built an <a href="http://qchackers.com">SDK for quantum computers</a> at Hack ISU last weekend. Evan Anderson, Dylan Sharp and I created a compiler and quantum virtual machine (QVM), which we used to demonstrate quantum teleportation between two QVMs. We received the <a href="https://twitter.com/MLHacks/status/977959734467858433">AWS Education prize</a>.</p>
<p><strong>Compiler and QVM</strong></p>
<p>The compiler was written in Common Lisp. I tried to write the compiler in Python, but Python did not have all the features I needed, like generating and executing code on the fly. Even though Common Lisp made deploying our SDK to the cloud difficult, it was a good trade-off because a Common Lisp implementation of the compiler is easier to extend than a Python implementation.</p>
<p>The QVM was implemented in Python with Numpy. Implementing the virtual machine was straightforward. The most difficult part was debugging. Our virtual machine executes basic single and two qubit gates. It also performs measurements on the qubits. Measurement results can be stored in a classical register via a classical instruction.</p>
<p><strong>Quantum teleportation</strong></p>
<p>Hack ISU was the first time Dylan studied quantum computation. His task was to grok quantum teleportation and figure out how to implement it across two QVMs. Dylan went through all the highs and lows of studying quantum computation in a relatively short period of time. He went from reviewing linear algebra to understanding quantum teleportation in 24 hours. Dylan corresponded with Evan in implementing quantum teleportation on the QVMs.</p>
<p><strong>Conclusion</strong></p>
<p>Hack ISU Spring 2018 was my most productive hackathon. I was lucky to work with someone who knew quantum computing, and another who was eager to learn. Without Dylan’s newfound knowledge of quantum teleportation and Evan’s expertise in Python programming and web development, we wouldn’t have accomplished as much as we did. I would definitely hack with this group of Qchackers again!</p>
<p>Discuss on <a href="https://github.com/vtomole/vtomole.github.io/issues/13">Github</a></p>
Quantum entanglement and Superdense coding20180303T00:00:00+00:00https://vtomole.com//blog/2018/03/03/sd <section>
<h1>Introduction</h1>
<p>Quantum entanglement is a powerful physical phenomenon where quantum systems interact in a way such that they can't be described independently of each other. This phenomenon is useful for sending classical information. Superdense coding is a procedure where quantum superposition and entanglement are used to encode two bits of information in one quantum bit. This blog post covers the basics of quantum systems and the mathematical methods used to describe and manipulate them. This post also shows how superdense coding is used to encode and decode the bits 01, followed by the execution of a superdense code on a Quantum Virtual Machine (QVM).</p>
<section/>
<section>
$\newcommand{\ket}[1]{\left|{#1}\right\rangle}$
<h2>Simple Quantum Systems</h2>
<p>The pure state of a two-level quantum system (qubit) is represented by a $2\times1$ matrix
\begin{align}
\ket{\psi} = \begin{bmatrix}
\alpha \\
\beta \\
\end{bmatrix} = \alpha \ket{0} + \beta \ket{1}
\end{align}
where $\alpha$ and $\beta$ are complex numbers that represent probability amplitudes and $${\mid \alpha \mid}^2 + {\mid \beta \mid}^2 = 1$$
The spin of an electron is a two-level quantum system where
\begin{align}
\ket{0} = \begin{bmatrix}
1 \\
0 \\
\end{bmatrix}
\end{align}
is spin down and
\begin{align}
\ket{1} = \begin{bmatrix}
0 \\
1 \\
\end{bmatrix}
\end{align}
is spin up.</p>
<section/>
<section>
<h2>Operators</h2>
<p>The state of a quantum system is modified by applying an operator $\textit{U}$ to it. $\textit{U}$ could be a Pauli-X matrix,
\begin{align}
X &= \begin{bmatrix}
0 & 1 \\
1 & 0 \\
\end{bmatrix}
\end{align}
or a Pauli-Z matrix.
\begin{align}
Z &= \begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
\end{align}
It can also be an identity matrix, whose application does not change the system's state.
\begin{align}
I &= \begin{bmatrix}
1 & 0 \\
0 & 1 \\
\end{bmatrix}
\end{align}
Operator application is realized by computing the dot product of the operator and the state. The Pauli-X operator flips the spin of an electron from up to down and vice versa.
\begin{align}
\begin{bmatrix}
0 & 1 \\
1 & 0 \\
\end{bmatrix} . \begin{bmatrix}
1 \\
0 \\
\end{bmatrix}
= \begin{bmatrix}
0 \\
1 \\
\end{bmatrix}
\end{align}
The Hadamard operator
\begin{align}
H &= \frac{1}{\sqrt{2}} \begin{bmatrix}
1 & 1 \\
1 & -1 \\
\end{bmatrix}
\end{align}
puts an electron in an equal superposition of spin up and spin down.
\begin{align}
\frac{1}{\sqrt{2}} \begin{bmatrix}
1 & 1 \\
1 & -1 \\
\end{bmatrix} .
\begin{bmatrix}
1 \\
0 \\
\end{bmatrix}
= \begin{bmatrix}
\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} \\
\end{bmatrix} = \frac{1}{\sqrt{2}} \ket{0} + \frac{1}{\sqrt{2}} \ket{1}
\end{align}
When this state is measured, there is a $\left(\frac{1}{\sqrt{2}}\right)^2 = 0.5$ chance that it will be $\ket{0}$ and a $\left(\frac{1}{\sqrt{2}}\right)^2 = 0.5$ chance that it will be $\ket{1}.$</p>
</section>
<section>
<h2>Quantum Entanglement</h2>
<p>The Controlled-NOT (CNOT) operator acts on two-qubit systems. It flips the state of the second qubit if the state of the first qubit is 1 and leaves the second qubit unchanged if the first qubit is 0.
\begin{align}
CNOT &= \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{bmatrix}
\end{align}
The tensor product combines two 2-level systems into one 4-level system, which is needed before applying the CNOT operator.
\begin{align}
A \otimes B = \begin{bmatrix}
A_{11} \begin{bmatrix}
b_{11} & b_{12} \\
b_{21} & b_{22} \\
\end{bmatrix} & A_{12} \begin{bmatrix}
b_{11} & b_{12} \\
b_{21} & b_{22} \\
\end{bmatrix} \\
A_{21} \begin{bmatrix}
b_{11} & b_{12} \\
b_{21} & b_{22} \\
\end{bmatrix} & A_{22} \begin{bmatrix}
b_{11} & b_{12} \\
b_{21} & b_{22} \\
\end{bmatrix}\\
\end{bmatrix}
\end{align}
Combining $\ket{1}$ and $\ket{0}$ with the tensor product and applying CNOT to the resulting state will flip the state of the second qubit.
\begin{align}
\begin{bmatrix}
0 \\
1 \\
\end{bmatrix} \otimes \begin{bmatrix}
1 \\
0 \\
\end{bmatrix} = \begin{bmatrix}
0 \\
0 \\
1 \\
0\\
\end{bmatrix} = \ket{10}
\end{align}
\begin{align}
CNOT\ket{10} = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{bmatrix}. \begin{bmatrix}
0 \\
0 \\
1 \\
0\\
\end{bmatrix} = \begin{bmatrix}
0 \\
0 \\
0 \\
1\\
\end{bmatrix} = \ket{11}
\end{align}</p>
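<p>The two matrix calculations above can be reproduced directly (a short NumPy sketch):</p>

```python
import numpy as np

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

# |10> = |1> (x) |0>
ket10 = np.kron(ket1, ket0)
print(ket10)  # [0 0 1 0]

# CNOT flips the second qubit because the first qubit is 1
ket11 = CNOT @ ket10
print(ket11)  # [0 0 0 1]
```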
</section>
<section>
<h2>Superdense coding</h2>
<p>Two bits of information can be encoded into one qubit. Superdense coding is performed by entangling two qubits and encoding two bits by applying an operator to one of them. That qubit is then sent to the receiver, who uses both qubits of the entangled pair to decode the bits that were encoded.
Suppose the sender wants to send the bits <i>01</i>. He will transform the first qubit into a superposition of states.
\begin{align}
\ket{\psi} = H(\ket{0})= \frac{1}{\sqrt{2}} \ket{0} + \frac{1}{\sqrt{2}} \ket{1} \tag{0}
\end{align}
The sender will entangle this qubit with a second qubit that is spin down.
\begin{align}
\ket{\psi} = CNOT(\frac{1}{\sqrt{2}} \ket{0} + \frac{1}{\sqrt{2}} \ket{1} \otimes \ket{0}) = \frac{1}{\sqrt{2}} \ket{00} + \frac{1}{\sqrt{2}} \ket{11}\tag{1}
\end{align}
The X operator is applied on the first qubit.
\begin{align}
\ket{\psi} = X \otimes I (\frac{1}{\sqrt{2}} \ket{00} + \frac{1}{\sqrt{2}} \ket{11}) = \frac{1}{\sqrt{2}} \ket{01} + \frac{1}{\sqrt{2}} \ket{10}\tag{2}
\end{align}
The receiver will use the first and second qubits to decode the <i>01</i> that was encoded in the first qubit.
\begin{align}
\ket{\psi} = CNOT(\frac{1}{\sqrt{2}} \ket{01} + \frac{1}{\sqrt{2}} \ket{10}) = \frac{1}{\sqrt{2}} \ket{01} + \frac{1}{\sqrt{2}} \ket{11}\tag{3}
\end{align}
\begin{align}
\ket{\psi} = H \otimes I (\frac{1}{\sqrt{2}} \ket{01} + \frac{1}{\sqrt{2}} \ket{11}) = \ket{01}\tag{4}
\end{align}
To encode <i>00</i>, no operator (the identity) is applied. The Z operator is applied in place of X to encode <i>10</i>. <i>11</i>
is encoded with both Z and X.</p>
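<p>All four encodings can be verified with a small statevector simulation (a NumPy sketch; <code>superdense</code> is a helper written for this post, not part of any library):</p>

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

ket0 = np.array([1.0, 0.0])

def superdense(bits):
    """Encode two classical bits on the first qubit of a Bell pair, then decode."""
    # Shared entangled pair: CNOT (H|0> (x) |0>)
    state = CNOT @ np.kron(H @ ket0, ket0)
    # Sender: X encodes the second bit, Z encodes the first bit
    op = I2
    if bits[1]:
        op = X @ op
    if bits[0]:
        op = Z @ op
    state = np.kron(op, I2) @ state
    # Receiver decodes with CNOT followed by H on the first qubit
    state = np.kron(H, I2) @ (CNOT @ state)
    # The state is now a computational basis state |b0 b1>
    return divmod(int(np.argmax(np.abs(state) ** 2)), 2)

for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(bits, "->", superdense(bits))
```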
</section>
<section>
<h2>Simulation of superdense coding</h2>
<p>It's not difficult to write a program that simulates superdense coding. A program that consists of instructions we can run on a quantum computer is composed of commands where each line contains an operator and the qubit(s) it's applied to. "H 1" means "Apply the Hadamard operator to qubit 1". A program that encodes and decodes the bits <i>01</i> could be</p>
<pre>
<code>
H 0
CNOT 0 1
X 0
CNOT 0 1
H 0
MEASURE 0
MEASURE 1
</code>
</pre>
<p>Since we don't have access to a quantum computer, we can run this program on a <a href ="https://github.com/vtomole/qchackers/blob/master/software/eagle/vm.py"> virtual machine</a> that simulates a quantum computer.
We paste the <i>01</i> coding program into a <a href = " https://github.com/vtomole/qchackers/blob/master/software/eagle/sd_coding.eg"> file</a> and execute it on a virtual machine.</p>
<pre>
<code>
python vm.py sd_coding.eg
MEASUREMENT of qubit 0 is 0
MEASUREMENT of qubit 1 is 1
</code>
</pre>
</section>
<section>
<h2>Conclusion</h2>
<p>Superdense coding is a glimpse into the possibilities of quantum information. The principles of superposition and entanglement are used to encode two bits of information in one qubit. This technique can be executed on a quantum computer or a QVM.</p>
<section/>
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/12">Github</a></p>
<section>
<h2>References</h2>
<ul>
<li><a href = "https://en.wikipedia.org/wiki/Superdense_coding">Wikipedia entry on Superdense coding</a></li>
<li><a href = "https://en.wikipedia.org/wiki/Qubit">Wikipedia entry on Qubits</a></li>
<li><a href = "https://en.wikipedia.org/wiki/Tensor_product">Wikipedia entry on the Tensor product</a></li>
<li><a href = "https://en.wikipedia.org/wiki/Quantum_gate">Wikipedia entry on Quantum gates</a></li>
</ul>
<section/>
My 15 minute presentation on Quantum computing20180202T00:00:00+00:00https://vtomole.com//blog/2018/02/02/qcintrofan<p>I did an <a href="https://vtomole.github.io/static/qcintrofan.pdf">Introduction to Quantum Computing</a> talk for the Friday Activities at Noon (FAN) Club today. I feel like I covered important concepts on how quantum computers work and how they are built.</p>
<p>I did not emphasize the difference between near-term and fault-tolerant quantum computers; so there might have been a couple of people in the audience who thought that the quantum devices of the next 5-10 years will be able to factorize large numbers or search unsorted databases efficiently.</p>
<p>If I had more time to present, I would have taken the opportunity to talk about the nuances of quantum computers; like the difference between physical and logical qubits. I would have also explained how simple quantum algorithms like the Deutsch-Jozsa and Grover’s algorithms work. I did have the opportunity to demonstrate how quantum algorithms are programmed (I demoed a Bell State experiment using <a href="https://github.com/rigetticomputing/pyquil">pyQuil</a>).</p>
<p>Overall, I feel like the presentation went extremely well. I would redo this talk if I had 15 minutes to explain quantum computing to Undergraduate Engineering Students.</p>
<p>Discuss on <a href="https://github.com/vtomole/vtomole.github.io/issues/11">Github</a></p>
My 3 hours on Rigetti’s quantum processor20180106T00:00:00+00:00https://vtomole.com//blog/2018/01/06/qpu<p><strong>Introduction</strong></p>
<p>Rigetti put their 19 qubit processor on the cloud at the end of last year. I was one of the first people to request access to it. I had the privilege of using this processor for 3 hours yesterday. Although I did not have a lot of time to play with all of the qubits that were on that computer, I had a lot of fun.</p>
<p><strong>Bell State</strong></p>
<p>The Bell State experiment for qubit 0 and qubit 1 was successful 76/100 times.</p>
<p><strong>GHZ</strong></p>
<p>The GHZ experiment is the entanglement of 3 qubits. Running this experiment on the QPU got me the correct result 49/100 times. The qubits I used were q0, q1 and q2. Since entangling qubits in hardware is a costly operation, it’s not surprising that this experiment performed worse than the Bell State.</p>
<p>UPDATE: The circuit program that I used for GHZ was not efficient. Rigetti ran the GHZ experiment with a better circuit and they got the correct result ~70/100 times. Thanks Ryan Karle!</p>
<p><strong>Grover’s algorithm</strong></p>
<p>I used q1 and q2 for 2-qubit implementations of Grover’s algorithm. Searching for 00 succeeded 84/100 times; 01 was a success 86/100 times; 10, 81/100 times; and 11, 79/100 times. Running these algorithms on the QVM gave the correct answer 100% of the time. Grover’s algorithm is probabilistic, so the results from this processor were good because it gave the correct result with a high probability (82.5%).</p>
<p><strong>Future Work</strong></p>
<p>Even though I had a 19 qubit computer at my disposal, I only used 2 qubits. I would like to perform Grover’s algorithm on a database of 19 qubits in the future. I would also like to experiment with a deterministic quantum algorithm like Deutsch–Jozsa.</p>
<p>There has recently been some tremendous progress in the development of quantum algorithms. Especially with respect to machine learning [3] [4] [5]. I unfortunately don’t have the experience required to be able to convert the results of these papers to programs that I can run on a quantum computer. Maybe someday…</p>
<p>Discuss on <a href="https://github.com/vtomole/vtomole.github.io/issues/10">Github</a></p>
<p><strong>References</strong></p>
<p>0: <a href="https://github.com/QCHackers/qchackers/tree/master/pyquil">Code and results for the experiments</a></p>
<p>1: <a href="https://quantumexperience.ng.bluemix.net/proxy/tutorial/fulluserguide/introduction.html">IBM Q Full User Guide</a></p>
<p>2: <a href="http://pyquil.readthedocs.io/en/latest/qpu_overview.html">The Rigetti QPU</a></p>
<p>3: <a href="https://www.nature.com/articles/s4153401700173">Demonstration of quantum advantage in machine learning</a></p>
<p>4: <a href="https://medium.com/rigetti/unsupervisedmachinelearningonrigetti19qwithforest1239021339699">Unsupervised Machine Learning on Rigetti 19Q with Forest 1.2</a></p>
<p>5: <a href="https://arxiv.org/abs/1712.05304">A quantum algorithm to train neural networks using lowdepth circuits</a></p>
Into Recurrent Neural Networks20160625T00:00:00+00:00https://vtomole.com//blog/2016/06/25/rnn<p> I have spent the past few weeks trying to understand Recurrent Neural Networks; either they are very simple to understand,
or I am missing something. What I know is that Recurrent Neural Networks (RNNs) are meant to keep track of sequences.
This makes RNNs good for machine translation, speech recognition, and natural language processing. </p>
<p>Recurrent Neural Networks are recursive, hence their name. They are composed of cells.
These cells have the basic components of neural networks: they are nodes that are connected together,
just like in normal neural networks. The difference is that the connections in a recurrent neural network can also loop back
to the cell itself, which is where the name “recurrent” comes from. This means that, unlike a basic or a convolutional neural network,
a recurrent neural network has memory. For example, a Recurrent Neural Network can be used to predict a word based on previous words.
There is one glaring problem for RNNs: the vanishing gradient problem, where training a neural network
becomes more difficult as the RNN gets bigger. </p>
<p>This is why we need Long Short-Term Memory networks (LSTMs), which can hold memory for a very long time. LSTMs are like RNNs in that they hold a state that can be updated, and there are functions inside the network that combine the current inputs with the previous outputs. But an LSTM can hold memory for a longer period of time: an LSTM cell can decide how long to hold a value and when to forget it based on how important that particular value (memory) is. This is an ingenious way of trying to simulate human memory.</p>
<p>RNNs are networks that can take inputs and output values based on activation functions, like normal neural networks; the difference is that RNNs can connect to themselves in a recursive manner. This makes them capable of predicting outputs based on their own previous inputs (having a memory). Plain RNNs don't have good memory, which is why LSTMs are used in practice: they can hold memory for as much time as they need to.</p>
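<p>To make the idea of a cell that feeds back into itself concrete, here is a minimal sketch of a single recurrent step in NumPy. The sizes and weight names (W_xh, W_hh) are made up for illustration; a real network would also have an output layer and be trained with backpropagation through time.</p>

```python
import numpy as np

# A minimal recurrent cell: hidden size 4, input size 3 (sizes are arbitrary).
rng = np.random.default_rng(0)
W_xh = rng.standard_normal((4, 3)) * 0.1  # input -> hidden weights
W_hh = rng.standard_normal((4, 4)) * 0.1  # hidden -> hidden: the "recurrent" connection

def rnn_step(x, h_prev):
    # The new hidden state mixes the current input with the previous state,
    # which is what gives the network its memory.
    return np.tanh(W_xh @ x + W_hh @ h_prev)

h = np.zeros(4)  # start with an empty memory
for x in [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]:
    h = rnn_step(x, h)  # the same weights are reused at every step
print(h.shape)  # (4,)
```

<p>The key detail is that the same W_hh is applied at every step, so the hidden state h carries information from earlier inputs forward in time.</p>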
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/9">Github</a></p> My New Computer20160606T00:00:00+00:00https://vtomole.com//blog/2016/06/06/computer<p>Last week, I learned a huge lesson: machine learning is costly. I paid around 1500 dollars for my new PC build. There were two reasons I got this PC. The first reason is that my PCs are very old. I have three PCs from the early 2000s. One of them does not even turn on anymore, and the other two are so old that the only Linux they can run is a lightweight distribution. I clearly needed a new computer. I’ll still be using my old ones, for the nostalgia. Keep in mind that the 1500-dollar price tag was only for the build. I still have my old monitor, keyboard, and mouse, which are over 10 years old. They work perfectly fine, so I don’t see the need to waste any more money. The PC has a 240 GB solid-state drive and an MSI 970 Gaming motherboard.</p>
<p>The second reason I got this PC was for machine learning. I haven’t run any machine learning algorithms on it yet, but I will pretty soon. The reason the build cost 1500 dollars was the GPU (Graphics Processing Unit). GPUs were originally invented for gaming: they have thousands of cores that compute pixel values in parallel, because gaming applications are graphics intensive. Since GPUs do calculations in parallel, they are great for number crunching. The difference between a CPU and a GPU is that a CPU has a small number of cores (from 2 to 8), while a GPU has thousands of cores, so a GPU is great for parallel computations.</p>
<p>There are two main driving forces of the recent artificial intelligence boom: the progress of hardware, and the amount of data that we have access to. Reinforcement Learning and Artificial Neural Networks are relatively old ideas that were invented and implemented in the mid-to-late 20th century. In fact, Reinforcement Learning was used to beat human players at Backgammon in the 1990s. The reason these ideas didn’t flourish earlier is that the hardware of the time wasn’t up to it (Moore’s law hadn’t yet delivered enough computing power) and we did not have enough data. You need a lot of data to train neural networks, and with a lot of data you also need sufficient hardware to do the matrix calculations for these networks efficiently. With the rise of the internet, a lot of data can be collected to train neural networks. In 1990, if you wanted millions of images to train a neural network to recognize pictures, it would have been very expensive to get them; now you can just write a program that crawls Google Images and get millions of images in a couple of days.</p>
<p>The progress of hardware is mainly driven by the GPU. The GPU is a mini-supercomputer: it can do millions of calculations on its thousands of cores in parallel. A neural network that takes two days to train on a CPU can be trained on a GPU in 15 minutes. This means that if there is something wrong in the algorithm you are running, you don’t have to wait a couple of days to find out. With large enough data sets and large enough neural networks, training can sometimes take a couple of months on a CPU; on a GPU it only takes one or two days.</p>
<p>The GPU that I have is an NVIDIA GeForce 970. It’s big, and it’s powerful. I already tested it on Windows, but I have to test it on Linux later. Testing it on Windows was not difficult: NVIDIA has a platform for programming GPUs called CUDA (<a href ="http://www.nvidia.com/object/cuda_home_new.html">here is the website</a>). They also make running your programs easy by integrating CUDA with Visual Studio. You can program in CUDA if you know C and C++, but you can also use Python wrappers. I was able to write some simple programs, like multiplying two matrices. The cool part was that I was running this program on my GPU, not a CPU! I still don’t know much about GPUs beyond how different they are from CPUs, but I know that the more I study the CUDA architecture and learn how to implement my own machine learning algorithms, the more I will learn about GPUs.</p>
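<p>CUDA itself is too much for one blog post, but the matrix multiplication I tested is easy to sketch with NumPy on the CPU (the matrices below are made up for illustration). On a GPU, each cell of the result can be computed by its own thread in parallel, which is exactly why matrix-heavy workloads benefit so much.</p>

```python
import numpy as np

# Two small made-up matrices.
A = np.arange(6, dtype=float).reshape(2, 3)
B = np.arange(12, dtype=float).reshape(3, 4)

C = A @ B  # each C[i, j] is the dot product of row i of A and column j of B

# Check one cell by hand against the definition.
assert C[0, 1] == sum(A[0, k] * B[k, 1] for k in range(3))
print(C.shape)  # (2, 4)
```

<p>Every C[i, j] is independent of the others, so a GPU can assign one thread per output cell and compute them all at once.</p>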
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/8">Github</a></p> Neural Networks and Function Approximations20160606T00:00:00+00:00https://vtomole.com//blog/2016/06/06/fnapprox<p>Convolutional neural networks are mainly used for computer vision. These Convolutional Neural Networks, or convnets, make it so you don’t have to use a lot of parameters to train your network. These networks can be thought of as volumes, because images have a width, a height, and a number of color channels (red, green, and blue values). When you multiply the height, the width, and the number of channels, you get a volume.</p>
<p>Professor Winston makes the general idea of convnets easy to understand. He says that to train a neural network on an image using a convnet, you run a neuron on a small part of the image. You then run this neuron on another small part of the image, and you keep doing this until you have run the neuron over every pixel in the image. Each output of this neuron corresponds to a specific place in the image; together, these outputs are the convolution of the image. The convolution of the image is composed of a set of values. You take the local maximum of these values and construct another, smaller image from them. You then slide the neuron over the image that you just created and repeat the process that you did for the original image to get further values. This process is called max pooling, because you are taking the maximum of the local points in the image. Min pooling is the opposite, where you take the local minimum of the values in the image.</p>
<p>The tool that you use to scan a small part of an image is called a kernel. A kernel detects features in the image. The great part about this process is that you can collect as many kernels as you want: 50 kernels, 100 kernels, and more. You then put the outputs of these kernels through your trained neural network, and your neural network decides whether the image is a dog, a car, a table, etc.</p>
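<p>Max pooling is simple enough to sketch in a few lines of NumPy. Here is a minimal version on a hypothetical 4x4 single-channel image, using 2x2 windows with stride 2 (the image values are made up):</p>

```python
import numpy as np

def max_pool(image, size=2):
    # Slide a size-by-size window over the image and keep only the
    # local maximum from each window.
    h, w = image.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h - h % size, size):
        for j in range(0, w - w % size, size):
            out[i // size, j // size] = image[i:i+size, j:j+size].max()
    return out

image = np.array([[1, 3, 2, 4],
                  [5, 6, 1, 0],
                  [7, 2, 9, 8],
                  [0, 1, 3, 2]], dtype=float)
print(max_pool(image))
# [[6. 4.]
#  [7. 9.]]
```

<p>Each window shrinks to its single largest value, which is how pooling reduces the size of the image while keeping the strongest feature responses.</p>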
<p>The most significant lesson that I learned this week is that neural networks are just elaborate function approximation tools. That is the only reason they exist: to approximate functions. When you are performing supervised learning, you have a function that you know, but your artificial neural network doesn’t. You then tweak the weights and biases of this network until your network’s function is the same as the one that you have. Michael Nielsen gives some great examples of how neural networks can compute any function. You can find that material here. Since artificial neural networks are used to approximate functions with as little error as possible, they are great for reinforcement learning. The backpropagation algorithm, together with gradient descent, is used to find the minimum of the loss function so that the error is as small as possible. </p>
<p>To make these ideas concrete, suppose that you wanted to train a robot to shake a hand. What you would do is give it a lot of data of people shaking hands with a dummy robot. This data would not necessarily be videos, but rather the data of the Markov states, the value functions, and the amount of reward that the dummy robot is getting. The robot would then, through trial and error, use this data to tweak its neural networks so that it could shake someone’s hand. The best part is that you wouldn’t have to program the robot to shake someone’s hand! It would learn on its own!
</p>
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/7">Github</a></p>
<h2>Reference:</h2>
<p>Winston, Patrick H. "Lecture 12B: Deep Neural Nets." MIT OpenCourseWare. MIT OpenCourseWare, 2015. Web. 30 May 2016.</p>
Into Markov Chains20160511T00:00:00+00:00https://vtomole.com//blog/2016/05/11/markov<p>
The last post talked about Markov Chains, so the next step is to use these chains to implement the reward process that is the basis of Reinforcement Learning. The reward process (specifically, the Markov Reward Process) is composed of a state space, a state transition matrix, a reward function, and a discount factor. The discount factor represents how much importance the agent puts on getting a reward immediately versus getting a reward later. When the discount factor is 0, the agent will take the action that gives it a reward immediately; when the factor is 1, the agent places importance on delayed gratification. For example, if someone offers me a cake and my discount factor is 0, I am looking for instant gratification, so I would eat the cake immediately. But when my discount factor is 1, I politely say no to the cake and eat an apple instead. The second decision is better for my health, so I was aiming for long-term gratification.</p>
<p>Every state in the state diagram should be labeled with a reward. This reward could be based on a point system. For the state diagram that I made in my last post, I would say that sleeping gives me a reward of 1 point, studying gives me a reward of 10 points, eating gives me a reward of 3 points, and hanging out online gives 14 points (because hanging out online tends to be a time-waster and is the most addictive compared to the other states on this diagram). The reward and discount factor can be combined into one equation that keeps track of how much the agent is getting as it moves from one state to another. This equation is called the return. In any Markov Reward Process, we want to maximize our return.</p>
<p>Now that we have defined the return, we can move on to the value function. The value function represents the long-term value of a state over many state transitions. It depends on the discount factor. If the discount factor is 0, then the value of a state is the same as the reward of that state, since the agent wants an instant reward. But as the discount factor moves away from 0, the value of a state will no longer be the same as its reward, since the long-term reward becomes a higher priority to the agent.</p>
<p>You can build the Bellman equation from the properties of the value function. The Bellman equation states that the value of a state is the reward of that state added to the values of the successor states, weighted by the probabilities of transitioning from your state to those next states and discounted. Simply put, the Bellman equation is v = r + d * P * v, where v is the vector of state values, r is the reward, d is the discount factor, and P is the transition probability matrix. The Bellman equation is very important because it breaks a big problem into “bite-sized” subproblems that we are capable of solving.</p>
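<p>Because v = r + d * P * v is linear, a small example can be solved outright by rearranging it to (I - d * P) * v = r. Here is a sketch for a hypothetical two-state chain; the rewards and probabilities below are made up for illustration.</p>

```python
import numpy as np

# A hypothetical two-state Markov Reward Process (numbers are made up).
P = np.array([[0.9, 0.1],   # row i holds the probabilities of leaving state i
              [0.2, 0.8]])
r = np.array([1.0, 10.0])   # reward received in each state
d = 0.9                     # discount factor

# Rearranging v = r + d * P @ v gives (I - d * P) @ v = r.
v = np.linalg.solve(np.eye(2) - d * P, r)

# The solution satisfies the Bellman equation.
assert np.allclose(v, r + d * (P @ v))
print(v)
```

<p>For large state spaces this direct solve gets too expensive, which is where iterative methods come in: they exploit exactly the "bite-sized" structure the Bellman equation provides.</p>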
<p>
The concept of Reinforcement Learning is thankfully not complicated at all. In my last post, I was wondering how I was going to implement a reward process for my agent. David Silver’s second Reinforcement Learning lecture answered that. He said that all Reinforcement Learning problems can be conceptualized in the form of Markov Chains. An awesome tutorial for Markov chains that I used is <a href ="http://setosa.io/ev/markovchains/">here</a>.</p>
<p>A Markov Chain is a process over a state space (just a list of states) that can be drawn as a graph. The process moves along based on the probability of a cursor going from one state to another. I am familiar with state spaces because I studied Finite State Machines last semester; for my class final project, I created a state machine and used it to implement a random number generator. The good thing about a Markov Chain is that it does not need to remember all of the past events; it only uses the current state to decide the next state. This is called the Markov Property.</p>
<p>A Markov Chain can be modeled in two ways: as a directed graph, where the vertices represent the states and the edges represent the probabilities, or as a transition matrix. The transition matrix is used more often because a Markov Chain can get very complex, so complex that it may be impossible or tedious to model it using a graph. This is why a transition matrix is almost always used to model Markov Chains. "Every state in the state space is included once as a row and again as a column. And each cell in the matrix tells you the probability of transitioning from its row state to its column state." (Powell)
</p>
<p>I won’t get into the specifics of transition matrix mathematics (because I don't know any of it yet), but I will create a transition matrix. This transition matrix models my day-to-day life in the summer. The main activities that I do in the summer are sleeping, eating, studying, and hanging out online. I am going to use numbers between 0 and 1 to model the probabilities, where a 10 percent chance is 0.10 and a 50 percent chance is 0.50. I created this transition matrix by first creating a state diagram, but I am unable to put the state diagram on here at the moment. My state diagram starts from sleeping and goes on from there. The outgoing edges from any one of my states should add up to 1, because all probabilities are supposed to add up to 100 percent.</p>
<p>The transition matrix below conceptualizes a Markov Chain beautifully. It shows all of the transitions from each state to the other states; for example, the probability that I go from sleeping to sleeping again is 0.1, and the probability that I go from eating to hanging out online is 0.75. Each row of the matrix represents the arrows leaving that state, so each row should add up to 1, because there is a 100 percent probability that something will happen no matter what state I am in.</p>
<table border="2">
<tr>
<td></td>
<td>Sleeping</td>
<td>Eating</td>
<td>Studying</td>
<td>Hanging out online</td>
</tr>
<tr>
<td>Sleeping</td>
<td>0.1</td>
<td>0.25</td>
<td>0.50</td>
<td>0.075</td>
</tr>
<tr>
<td>Eating</td>
<td>0.30</td>
<td>0.05</td>
<td>0.10</td>
<td>0.75</td>
</tr>
<tr>
<td>Studying</td>
<td>0.02</td>
<td>0.175</td>
<td>0.30</td>
<td>0.15</td>
</tr>
<tr>
<td>Hanging out online</td>
<td>0.67</td>
<td>0.75</td>
<td>0.10</td>
<td>0.70</td>
</tr>
</table>
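<p>A transition matrix like this is also easy to simulate. The sketch below uses a hypothetical, properly normalized matrix over the same four states (every row sums to 1, since a random walk needs valid probability distributions; the numbers are made up and differ from my table above):</p>

```python
import numpy as np

states = ["Sleeping", "Eating", "Studying", "Hanging out online"]
# Hypothetical transition matrix: row = from-state, column = to-state.
P = np.array([[0.10, 0.25, 0.50, 0.15],
              [0.30, 0.05, 0.10, 0.55],
              [0.20, 0.30, 0.30, 0.20],
              [0.40, 0.20, 0.10, 0.30]])
assert np.allclose(P.sum(axis=1), 1.0)  # every row is a probability distribution

rng = np.random.default_rng(0)
state = 0  # start the day asleep
for _ in range(5):
    # Pick the next state using the probabilities in the current state's row.
    state = rng.choice(len(states), p=P[state])
    print(states[state])
```

<p>Notice the Markov Property in action: each step only looks at the current row of P, never at the history of earlier states.</p>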
<p>The next step is figuring out how to model the reward process using Markov Chains.</p>
<p>
There are three methods of solving a Markov Decision Process: Dynamic Programming, Monte Carlo, and Temporal Difference Learning.</p>
<p>Dynamic Programming is a method you can use when you know the full Markov Decision Process. The advantage of this method is that there is a specific algorithm you can use to find your value function. The setting this method is usually taught in is "grid world": a grid that contains the value functions and determines the optimal policy each state will follow until the episode terminates. I am still trying to understand this grid world, so I can't go into its specific aspects.</p>
<p>Reinforcement learning depends on optimizing value functions. There are also some situations where you don't know what your Markov Decision Process is. This is an area suited for the Monte Carlo method. The Monte Carlo method relies on random sampling (it is also used for numerical integration). This random sampling is akin to the agent trying out everything in its environment and finding out what maximizes its value function. In other words, the Monte Carlo method learns from experience.</p>
<p>Temporal Difference Learning is a fundamental concept in reinforcement learning because it combines Dynamic Programming and the Monte Carlo method. This method, like Monte Carlo, learns from experience and does not need a model of the Markov Decision Process. Unlike the Monte Carlo method, Temporal Difference Learning does not need the episode to terminate to update the value function: like Dynamic Programming, it updates the value functions on the fly. This is called bootstrapping.</p>
<p>The methods that solve a Markov Decision Process share the same two steps. The first step is policy evaluation, which estimates the value function for a policy. The second step is policy improvement, which finds a better policy once you have a value function.
</p>
<h2>References:</h2>
<p>Powell, Victor. "Markov Chains Explained Visually." Explained Visually. N.p., n.d. Web. 11 May 2016.</p>
<p>Silver, David. "Markov Decision Processes." SpringerReference (n.d.): n. pag. 2015. Web. 5 May 2016. </p>
<p>Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning: An Introduction. Cambridge, MA: MIT, 1998. Print. </p>
Into Reinforcement Learning20160509T00:00:00+00:00https://vtomole.com//blog/2016/05/09/reinforcement<p>
The basic premise of this type of learning is that an agent only does things that maximize its reward. The agent doesn't know whether what it's doing is good or bad; you just give it a reward if it does what you want. The same concept can also be applied to animals; a classic example of this type of learning is Pavlov's dog. Another example is the way children learn: children get punished when they don't do what an authority figure wants of them, and they get rewarded when they do.</p>
<p> The same principle can be applied to an agent. An agent takes in an observation and a reward from the environment, and it outputs an action back to the environment. The agent that takes in these inputs and produces these outputs can be thought of as a black box for now, but inside that black box there are algorithms that calculate the output based on the rewards and the state of the environment that the agent is in. This self-perpetuating cycle is another type of machine learning.
</p>
<p>To get started on Reinforcement Learning, I wrote a program that guesses a random number and terminates if the number that was guessed is correct. The program is extremely simple; in fact, it is not AI-related whatsoever. It's just a random number checker, but it's a start nonetheless. It goes against the true nature of reinforcement learning, but it gets some of the basics right. The random function guesses a number; if the number is not 9, the program keeps guessing until it guesses the correct one. The problem with this program is that I am not giving it a reward, and it's not learning that 9 is the correct answer. What I need to do is have a point system where if it guesses 9, I give it 1 point, and if it guesses anything else, I subtract one point; my agent should then learn that by guessing 9, it will earn more points.
</p>
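<p>Here is a sketch of what that point system might look like. The scoring is the hypothetical one described above; the "agent" still guesses uniformly at random and does not actually learn anything yet.</p>

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
points = 0
guesses = 0
while True:
    guess = random.randint(0, 9)  # the "agent" blindly guesses a number
    guesses += 1
    if guess == 9:
        points += 1   # reward: +1 point for the correct answer
        break
    points -= 1       # punishment: -1 point for every wrong guess
print(guesses, points)
```

<p>A learning agent would use the accumulated points to bias future guesses toward 9 instead of continuing to guess at random.</p>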
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/4">Github</a></p>
<h2>References:</h2>
<p>Silver, David. "Lecture 1: Introduction to Reinforcement Learning." (n.d.): n. pag. 2015. Web. 5 May 2016. http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/intro_RL.pdf. </p>
Backpropagation using an alternative Activation Function20160430T00:00:00+00:00https://vtomole.com//blog/2016/04/30/backprop<p>The backpropagation algorithm has two parts: forward propagation and backpropagation. Forward propagation is the first half, used to find the loss of the neural network. The loss could be called an error, because it measures how wrong our neural network's outputs are. </p>
<p>This error is calculated by multiplying our weights with our input and passing the product of these vectors to our activation function. The activation function takes in the array of numbers and squashes those numbers into the range of the activation function. The result is stored in a hidden layer. It is used to calculate the loss (error) by subtracting the desired output from the values of the hidden layer (the numbers that our activation function returned).</p>
<p>The second half of this algorithm is backpropagation. Backpropagation moves backwards through the network, and (as far as I know right now) we multiply the error at the hidden layer by the gradient of the activation at the current weights. Once this is done, we change the weights by the dot product of our input array and the gradient of what we calculated during forward propagation (the second layer of the network, on a two-layer network). Backpropagation is a heavy subject, so I'll be spending more time studying it.</p>
<p>
The tutorial that I've been using to study neural networks uses the logistic function to predict the output of a truth table with three inputs. The table below shows the three inputs x, y, z and an output. As you can see, x is exactly the same as the output. These are the kinds of patterns that neural networks are supposed to recognize. This is also the same method that humans use to learn: by recognizing patterns. The difficulty is just how complex those patterns are.</p>
<p> The example that I was using, which is cited below, uses a logistic function to predict the output. I was wondering if I could get the same behaviour from a simple neural network using a different activation function. The activation function that I used is called the tanh function, which is explained well <a href ="http://mathworld.wolfram.com/HyperbolicTangent.html">here</a>. If you look at this function's graph, it looks similar to the logistic function, except that instead of having a range from 0 to 1, it has a range from -1 to 1. I don't know enough about neural networks to tell when you want to use a certain function over another, but I will find out soon enough! </p>
<table border="1">
<tr>
<td>x</td>
<td>y</td>
<td>z</td>
<td>output</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
</table>
<p> These were the results I got after training the neural network 100,000 times.</p>
<table border="1">
<tr>
<td>Output After Training</td>
<td>Expected</td>
</tr>
<tr>
<td>0.99999703</td>
<td>1</td>
</tr>
<tr>
<td>0.99999702</td>
<td>1</td>
</tr>
<tr>
<td>0.99999702</td>
<td>1</td>
</tr>
<tr>
<td>0.99999703</td>
<td>1</td>
</tr>
</table>
<p>As you can see, the results are not completely perfect, but the network and the inputs were simple enough for them to be close. This shows that learning is never perfect: there is always some error that we have to account for in all our learning algorithms.</p>
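<p>For reference, here is a sketch of the kind of tiny network used in this experiment, with tanh as the activation. The truth table below is a hypothetical, non-degenerate one (the output copies the x column), and the backward pass uses the fact that the derivative of tanh(z) is 1 - tanh(z)**2.</p>

```python
import numpy as np

# Hypothetical truth table: the output is just the x column.
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)
y = np.array([[0, 0, 1, 1]], dtype=float).T

rng = np.random.default_rng(1)
W = rng.standard_normal((3, 1)) * 0.1  # one layer of weights

for _ in range(10000):
    out = np.tanh(X @ W)           # forward propagation
    error = y - out                # how far off the network is
    grad = error * (1 - out**2)    # error scaled by the slope of tanh
    W += 0.1 * (X.T @ grad)        # nudge the weights downhill
print(out.round(3))
```

<p>Swapping tanh for the logistic function only changes the forward pass and the derivative used in the backward pass; the rest of the training loop is identical.</p>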
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/4">Github</a></p>
<h2>Reference:</h2>
<p>Trask, Andrew. "A Neural Network in 11 Lines of Python (Part 1)." Iamtrask. N.p., 12 July 2015. Web. 30 Apr. 2016. </p>
Basic Machine Learning20160423T00:00:00+00:00https://vtomole.com//blog/2016/04/23/basicml<p>Backpropagation is the standard machine learning algorithm. From what I understand, it works like this:</p>
<p>1. Set all the weights of your neural network to random values.</p>
<p>2. Take a training example, put it through your neural network and try to predict what your neural network will make of it.</p>
<p>3. Compare what your neural network’s output is with the output that you want (the correct output).</p>
<p>4. Adjust your weights.</p>
<p>5. Keep doing this over and over again until your network gets the output that you want. Once this happens, your neural network has learned.</p>
<p>This is called Supervised Learning because you can check if your network is correct or not by comparing the results of your network to your training examples.</p>
<p>The weights are adjusted using the Gradient Descent algorithm. This is how the weights and biases are tweaked. Gradient descent is summarized well <a href ="http://mathworld.wolfram.com/MethodofSteepestDescent.html">here</a>. It searches for a local minimum of a function, which is where the slope of the function is zero (in other words, where the derivative of the function is equal to 0). The problem is that you can only solve directly for that minimum when the function has a few variables. Neural networks can get very big, sometimes having millions of variables, so solving for the minimum of these kinds of functions directly is computationally intractable. That is why the iterative Gradient Descent algorithm is used for huge networks. The next step for me is to write a program for the Gradient Descent algorithm, and maybe also a program that can compute general derivatives. </p>
<p>Summary: Backpropagation finds the error in your weights by comparing your neural network’s output to the output that you want your network to have, and the Gradient Descent algorithm is used to adjust those weights.</p>
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/3">Github</a></p>
<h2>References:</h2>
<p>Michael A. Nielsen, “Neural Networks and Deep Learning”</p>
<p>Trask, Andrew. "A Neural Network in 13 Lines of Python (Part 2  Gradient Descent)."  I Am Trask. N.p., 27 July 2015. Web. 23 Apr. 2016.</p>
The Simplest Artificial Neuron20160415T00:00:00+00:00https://vtomole.com//blog/2016/04/15/neuron <p>
The human brain is a complicated system that is responsible for every single human accomplishment and endeavor. It is the best example we currently have of intelligence, and it is the kind of intelligence that researchers are currently trying to emulate. If we want to understand the brain at its most basic level, we can start by learning about neurons. Neurons are the underlying cause of all human cognition and behavior. Neurons are also not unique to humans; most animals have them. Neurons communicate with each other through electrical signals and chemical processes. The neurons in the nervous system communicate through axon terminals and dendrites, which means every neuron has an input and an output.</p>
<p>An artificial neuron is structured in the same way: it has an input and an output. An artificial neuron has a weight vector, which determines how important each input is for the kind of output that we want from the neuron. The artificial neuron also has an activation function that gives us the actual output of the neuron. We take the dot product of the input vector and the weight vector to get the input to our activation function. <a href ="http://mathworld.wolfram.com/DotProduct.html">Here is a quick summary of the dot product.</a> The activation function that is commonly used is the sigmoid function. <a href ="http://mathworld.wolfram.com/SigmoidFunction.html">Here is a quick summary of the sigmoid function.</a> </p>
<p>Summary: A simple artificial neuron is a sigmoid function whose input is the dot product of the weight vector and the input vector.</p>
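<p>That summary fits in a few lines of code. Here is a sketch with made-up weights and inputs:</p>

```python
import numpy as np

# A simple artificial neuron: sigmoid of the dot product of the
# weight vector and the input vector. The numbers are made up.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

weights = np.array([0.5, -0.3, 0.8])
inputs = np.array([1.0, 0.0, 1.0])

z = np.dot(weights, inputs)   # 0.5*1 + (-0.3)*0 + 0.8*1 = 1.3
output = sigmoid(z)
print(round(output, 3))  # 0.786
```

<p>The whole neuron is just those two steps: a dot product, then a squashing function that keeps the output between 0 and 1.</p>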
<p> Discuss on <a href = "https://github.com/vtomole/vtomole.github.io/issues/2">Github</a></p>
<h2>Reference:</h2>
<p>Nielsen, Michael. "CHAPTER 1." Neural Networks and Deep Learning. VisionSmarts,Ersatz,G Squared Capital,TinEye, 22 Jan. 2016. Web. 15 Apr. 2016.
</p>