Intro & Overview

Hello There! Welcome To This Blog.

Hello there! I’m Shiyi — a data enthusiast, researcher, and creative mind with a passion for the intricate dance between data and innovation. My journey spans the realms of Data Science, Machine Learning, and Natural Language Processing (NLP), all fueled by a drive to use technology to uncover insights and make a meaningful impact.

Q What drives my work in Data Science and Machine Learning?

A At the heart of everything I do is a deep appreciation for algorithms and the stories hidden within data. Whether I’m building predictive models in finance or developing image recognition systems using deep learning, I thrive on transforming raw data into actionable intelligence.

Q What is my expertise in Natural Language Processing?

A My main research focus is in NLP — a field that lets machines understand and generate human language. From sentiment analysis of social media conversations to machine translation, I’m fascinated by how language technologies are reshaping the way we communicate and understand the world.

Read More

Markov Processes

Discussions on Markov Processes Continued

In a different blog, I noted the use of Markov processes in the context of natural language processing. In this blog, we will go through some important details of the concept.

In the paragraphs that follow, we will also go through some code showing how to simulate a Markov chain.

Markov Chain Basics

A Markov chain is a mathematical system that undergoes transitions from one state to another within a finite or countable number of states. It is a stochastic process that satisfies the Markov property, which states that the future state depends only on the current state and not on the sequence of events that preceded it.
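
As a quick illustration, here is a minimal sketch of simulating such a chain in Python. The two-state "weather" model and its transition probabilities below are made up purely for demonstration and are not taken from the full post.

import random

# Hypothetical two-state chain; transition probabilities are illustrative only.
transitions = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def simulate_markov_chain(start_state, num_steps):
    """Simulate a Markov chain: the next state depends only on the current state."""
    state = start_state
    path = [state]
    for _ in range(num_steps):
        next_states = list(transitions[state].keys())
        weights = list(transitions[state].values())
        state = random.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path

print(simulate_markov_chain("sunny", 10))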

Components of a Markov Chain

  1. States: The different possible conditions or configurations the system can be in.

Read More

Problem Solving

More on Logic And Problem Solving

In a different blog, I briefly introduced some of the most important concepts in logic and problem solving, including but not limited to predicate calculus, propositional logic, and lambda calculus.

In this blog, the notes will go into more detail and introduce related ideas.

Defining Entailment, Implicatures, and Presuppositions

Implicatures: What’s suggested in an utterance, even though it is not explicitly stated or entailed by the utterance. For example, “Some of the students passed” typically implicates that not all of them did.

Entailment: Entailment is a relationship between statements where one statement necessarily follows from another. If statement A entails statement B, then whenever A is true, B must also be true. For example, “The cat is asleep on the mat” entails “The cat is on the mat.”

Presuppositions: A presupposition is an assumption that a speaker makes about what the listener already knows or believes to be true. It’s information taken for granted in the utterance. For example, “She stopped playing piano” presupposes that she used to play piano.
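
To make the entailment relation concrete, here is a small sketch that checks entailment between propositional formulas by brute-forcing truth assignments; the formulas and variable names are illustrative, not from the original post.

from itertools import product

def entails(premise, conclusion, variables):
    """Return True if every truth assignment that makes `premise` true also makes `conclusion` true."""
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if premise(assignment) and not conclusion(assignment):
            return False
    return True

# "A and B" entails "A", but "A or B" does not.
print(entails(lambda v: v["A"] and v["B"], lambda v: v["A"], ["A", "B"]))  # True
print(entails(lambda v: v["A"] or v["B"], lambda v: v["A"], ["A", "B"]))   # False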

Read More

Jacobian Matrices

Discussions on Jacobian Matrices Continued

This blog continues the discussion of Jacobian matrices and Taylor expansions, breaking them down in plain language and exploring how they are connected.

Jacobian Matrix

What it is:

  • Imagine you have a function that takes multiple inputs and gives multiple outputs. For example, you might have a function that takes two numbers (like coordinates $x$ and $y$) and gives back two other numbers.
  • The Jacobian matrix is a way to capture how small changes in each input affect each output; a short worked example follows below.
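
For a concrete worked example (the function is illustrative, chosen only so the partial derivatives are easy to compute), take $f(x, y) = (x^2 y,\ 5x + \sin y)$. Its Jacobian collects all the first-order partial derivatives:

$$
J = \begin{bmatrix} \dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} \\[4pt] \dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} \end{bmatrix} = \begin{bmatrix} 2xy & x^2 \\ 5 & \cos y \end{bmatrix}
$$

Each row describes how one output responds to small changes in each of the inputs.
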
Read More

Viterbi Algorithm

This Blog Will Explain The Mechanism of The Viterbi Algorithm

In this blog, we will explain the Viterbi algorithm and walk through a Python code demonstration for a sequence prediction task.


Viterbi Algorithm: Explanation and Code Demonstration

The Viterbi Algorithm is a dynamic programming technique used to find the most probable sequence of hidden states in a Hidden Markov Model (HMM). It’s widely applied in sequence prediction tasks like speech recognition, natural language processing, and bioinformatics.

In this post, we’ll not only break down the algorithm’s mechanism but also provide a practical Python code demonstration to predict hidden states based on observed data.

Components of the Viterbi Algorithm

Before diving into the algorithm, let’s review the core components of a Hidden Markov Model:
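
As a preview, here is a minimal sketch of the Viterbi recursion on a toy HMM. The states, observations, and probabilities below are invented purely for illustration and are not taken from the full post.

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Find the most probable hidden-state sequence for the observations."""
    # V[t][s] = best probability of any state path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    path = {s: [s] for s in states}

    for t in range(1, len(observations)):
        V.append({})
        new_path = {}
        for s in states:
            # Pick the previous state that maximises the path probability.
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][observations[t]], p)
                for p in states
            )
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path

    best_prob, best_state = max((V[-1][s], s) for s in states)
    return best_prob, path[best_state]

# Toy HMM (illustrative numbers only)
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

print(viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p))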

Read More

Variational Families

Introducing The Gists of Variational Autoencoders (VAEs)

What is a Variational Autoencoder (VAE)?

Imagine you have a magical machine that can take a picture of your favorite toy, turn it into a secret code, and then use that code to recreate the toy’s picture. This is kind of what a Variational Autoencoder (VAE) does, but instead of toys, it works with things like images, sounds, or even words.

A VAE is a type of artificial brain (or neural network) that learns to compress data (like a picture) into a simpler form and then uses that simple form to recreate the original data. The “variational” part means that it doesn’t just learn one way to represent the data, but many possible ways, which helps it be more flexible and creative.

Math Foundations: Breaking it Down

Let’s keep things simple. Imagine you have a bunch of colored balls, and you want to sort them by color. First, you need to pick a way to describe each color with a number. This number is like the “code” that the VAE creates. Then, once you have the code, you need to figure out how to recreate the exact color from the number.

  1. Encoder: This part of the VAE is like a friend who looks at the color of the ball and writes down a secret code (a number) that represents that color (a rough code sketch follows below).
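
As a rough sketch of what the encoder looks like in code, here is a minimal example. It assumes PyTorch and toy layer sizes, neither of which is specified in the original post.

import torch
import torch.nn as nn

class TinyVAEEncoder(nn.Module):
    """Maps an input to the mean and log-variance of its latent 'secret code'.
    Layer sizes are illustrative only."""
    def __init__(self, input_dim=784, hidden_dim=128, latent_dim=16):
        super().__init__()
        self.hidden = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)

    def forward(self, x):
        h = torch.relu(self.hidden(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterisation trick: sample a code while keeping gradients flowing.
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)
        return z, mu, logvar

# Encode a batch of 4 flattened 28x28 "images" (random data for illustration).
encoder = TinyVAEEncoder()
z, mu, logvar = encoder(torch.randn(4, 784))
print(z.shape)  # torch.Size([4, 16])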

Read More

Mutual Information

More on Mutual Information

Below is code that computes the pointwise mutual information (PMI) scores underlying WAPMI, the Weighted Average Point-wise Mutual Information. Roughly speaking, these scores measure how far the observed co-occurrence of two words is from what we would expect if the words occurred independently.

WAPMI is a method used in computational linguistics to measure the strength of association between words in a given context, typically in the analysis of text data.

Imagine you have a box of different colored marbles, and you want to know which colors tend to appear together. WAPMI helps you figure out how often certain words (or colors) appear together more often than by random chance. It’s like a smart way to understand word relationships in sentences!

import math
import collections

def calculate_pmi(joint_prob, marginal_prob1, marginal_prob2):
    """
    Calculate the pointwise mutual information (PMI) between two words.

    :param joint_prob: The joint probability of the two words
    :param marginal_prob1: The marginal probability of the first word
    :param marginal_prob2: The marginal probability of the second word
    :return: The PMI score
    """
    if joint_prob == 0 or marginal_prob1 == 0 or marginal_prob2 == 0:
        return 0  # Avoid division by zero
    return math.log(joint_prob / (marginal_prob1 * marginal_prob2), 2)

def calculate_pmi_corpus_optimized(corpus):
    """
    Calculate the PMI scores for all pairs of words in a corpus.

    :param corpus: The corpus of text
    :return: A dictionary of PMI scores
    """
    word_counts = collections.defaultdict(int)
    cooccurrence_counts = collections.defaultdict(int)
    total_sentences = len(corpus)

    # Precompute word counts and co-occurrence counts
    for sentence in corpus:
        unique_words = set(sentence)  # Avoid counting duplicates within the same sentence
        for word in unique_words:
            word_counts[word] += 1
        for word1 in unique_words:
            for word2 in unique_words:
                if word1 != word2:
                    cooccurrence_counts[(word1, word2)] += 1

    # Calculate PMI scores
    pmi_scores = {}
    for (word1, word2), joint_count in cooccurrence_counts.items():
        joint_prob = joint_count / total_sentences
        marginal_prob1 = word_counts[word1] / total_sentences
        marginal_prob2 = word_counts[word2] / total_sentences
        pmi = calculate_pmi(joint_prob, marginal_prob1, marginal_prob2)
        pmi_scores[(word1, word2)] = pmi

    return pmi_scores

# Example usage
corpus = [
    ["this", "is", "a", "foo", "bar"],
    ["bar", "black", "sheep"],
    ["foo", "bar", "black", "sheep"],
    ["sheep", "bar", "black"]
]

pmi_scores = calculate_pmi_corpus_optimized(corpus)
print(pmi_scores)

Read More

Philosophy of Mind

Cognitive Science and The Philosophy of Mind

Q What is the focus of this blog?

A This blog will summarize articles, papers, and materials I have gone through that touch on the Philosophy of Mind and how it lays an important foundation for the development of general artificial intelligence.

The blog covers the following topics:

  • What Constitutes The Philosophy of Mind
Read More

Contemporary NLP

Introduction to Contemporary NLP

Q What is the importance of psychological concepts in NLP?

A To understand modern natural language processing (NLP), it’s essential to draw on crucial psychological concepts like the Language of Thought Hypothesis and the Representational Theory of Mind. These concepts help explain how our brain processes and produces language and mental representations, which are foundational for NLP.

Language of Thought Hypothesis (LOTH)

Q What does the Language of Thought Hypothesis (LOTH) propose?

A LOTH proposes that our brain has a schema for producing a language of thought, known as Mentalese. It suggests that mental states and thoughts have a structured, language-like format, which facilitates reasoning, problem-solving, and decision-making.

Q What are propositional attitudes in LOTH?

Read More