Intro & Overview

Hello There! Welcome To This Blog.

I’m Shiyi. I’m deeply passionate about the intricate dance between data and innovation. With a fervent zeal for leveraging technology to extract insights and create a meaningful impact, I’ve embarked on a journey that spans the realms of Data Science, research, and creative expression.

Q What drives my work in Data Science and Machine Learning?

A At the heart of my endeavors lies a profound appreciation for machine learning, deep learning, and Natural Language Processing. As an advocate for data-driven decision-making, I thrive on unraveling the complexities of algorithms and patterns, harnessing their power to transform raw data into actionable intelligence. From predictive modeling in finance to image recognition tasks using deep learning architectures, I relish the challenge of pushing the boundaries of what’s possible with data.

Q What is my expertise in Natural Language Processing?

A My expertise extends to the captivating domain of Natural Language Processing. In an era inundated with information, I’m committed to empowering systems to understand, analyze, and generate insights from vast textual data. Whether it’s sentiment analysis to decipher the mood of social media conversations or language translation to bridge communication gaps, I’m fascinated by the potential of NLP to revolutionize how we interact with language.

Read More

Markov Processes

Discussions on Markov Processes Continued

In a different blog, I noted the use of Markov processes in the context of natural language processing. In this blog, we will go through some important details of the concept.

We will also go through some code showing how to simulate a Markov chain.

Markov Chain Basics

A Markov chain is a mathematical system that undergoes transitions from one state to another within a finite or countable number of states. It is a stochastic process that satisfies the Markov property, which states that the future state depends only on the current state and not on the sequence of events that preceded it.
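
As a quick, concrete illustration, below is a minimal sketch of simulating such a chain in Python. The two weather states and their transition probabilities are made-up toy values for this example, not data from any real model.

import random

# Hypothetical two-state weather chain; the transition probabilities are made up.
# transitions[state] maps each possible next state to its probability.
transitions = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def simulate_markov_chain(start_state, num_steps, seed=None):
    """Simulate a Markov chain: each next state depends only on the current state."""
    rng = random.Random(seed)
    state = start_state
    path = [state]
    for _ in range(num_steps):
        next_states = list(transitions[state])
        weights = list(transitions[state].values())
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path

print(simulate_markov_chain("sunny", num_steps=10, seed=42))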

Components of a Markov Chain

  1. States: The different possible conditions or configurations the system can be in.

Read More

Problem Solving

More on Logic And Problem Solving

In a different blog, I have briefly introduced some of the most important concepts of logic and problem solving, including but not limited to predicate calculus, propositional logic, and lambda calculus.

In this blog, the notes will go into more detail and introduce related ideas.

Defining Entailment, Implicatures, and Presuppositions

Implicatures: What’s suggested in an utterance, even though it is not explicitly stated or entailed by the utterance.

Entailment: Entailment is a relationship between statements where one statement necessarily follows from another. If statement A entails statement B, then if A is true, B must also be true.

Presuppositions: A presupposition is an assumption that a speaker makes about what the listener already knows or believes to be true. It’s information taken for granted in the utterance.
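
To make the entailment definition concrete, here is a small sketch, not from the original post, that checks propositional entailment by brute force: A entails B exactly when B is true under every truth assignment that makes A true.

from itertools import product

def entails(premise, conclusion, variables):
    """Check propositional entailment by enumerating all truth assignments.

    premise and conclusion are functions that take a dict of truth values
    (one per variable name) and return a bool.
    """
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # Entailment fails if some assignment makes the premise true
        # while leaving the conclusion false.
        if premise(assignment) and not conclusion(assignment):
            return False
    return True

# "p and q" entails "p", but "p" does not entail "p and q".
print(entails(lambda v: v["p"] and v["q"], lambda v: v["p"], ["p", "q"]))   # True
print(entails(lambda v: v["p"], lambda v: v["p"] and v["q"], ["p", "q"]))   # False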

Read More

Jacobian Matrices

Discussions on Jacobian Matrices Continued

This blog continues the discussion of Jacobian matrices and Taylor expansions, breaking them down in plain language and exploring how they are connected.

Jacobian Matrix

What it is:

  • Imagine you have a function that takes multiple inputs and gives multiple outputs. For example, you might have a function that takes two numbers (like coordinates $x$ and $y$) and gives back two other numbers.
  • The Jacobian matrix is a way to capture how small changes in each input affect each output.
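
As a hedged, concrete illustration (not from the original post), the sketch below estimates the Jacobian of a toy two-input, two-output function with central finite differences, so each column shows how both outputs respond to a small nudge in one input.

import numpy as np

def f(v):
    """A toy function mapping (x, y) to two outputs."""
    x, y = v
    return np.array([x**2 * y, 5 * x + np.sin(y)])

def numerical_jacobian(func, point, eps=1e-6):
    """Estimate the Jacobian of func at point using central differences."""
    point = np.asarray(point, dtype=float)
    outputs = func(point)
    jac = np.zeros((outputs.size, point.size))
    for j in range(point.size):
        step = np.zeros_like(point)
        step[j] = eps
        # Column j: change in every output per unit change in input j.
        jac[:, j] = (func(point + step) - func(point - step)) / (2 * eps)
    return jac

print(numerical_jacobian(f, [1.0, 2.0]))
# Analytic Jacobian at (1, 2): [[2xy, x^2], [5, cos(y)]] = [[4, 1], [5, cos(2)]]
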
Read More

Viterbi Algorithm

This Blog Will Explain The Mechanism of The Viterbi Algorithm

In this blog, we will explain the Viterbi algorithm and walk through a Python code demonstration for a sequence prediction task.


Viterbi Algorithm: Explanation and Code Demonstration

The Viterbi Algorithm is a dynamic programming technique used to find the most probable sequence of hidden states in a Hidden Markov Model (HMM). It’s widely applied in sequence prediction tasks like speech recognition, natural language processing, and bioinformatics.

In this post, we’ll not only break down the algorithm’s mechanism but also provide a practical Python code demonstration to predict hidden states based on observed data.
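
Before the full walkthrough, here is a minimal, self-contained sketch of the algorithm. The weather/activity HMM and all of its probabilities below are made-up toy values, not the post’s data.

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path and its probability."""
    # V[t][s] = probability of the best path that ends in state s at time t.
    V = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        V.append({})
        back.append({})
        for s in states:
            # Best previous state leading into s at time t.
            prev = max(states, key=lambda p: V[t - 1][p] * trans_p[p][s])
            V[t][s] = V[t - 1][prev] * trans_p[prev][s] * emit_p[s][observations[t]]
            back[t][s] = prev
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path, V[-1][last]

# Toy HMM: hidden weather states, observed activities (made-up probabilities).
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

print(viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p))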

Components of the Viterbi Algorithm

Before diving into the algorithm, let’s review the core components of a Hidden Markov Model:

Read More

Variational Families

Introducing The Gist of Variational Autoencoders (VAEs)

What is a Variational Autoencoder (VAE)?

Imagine you have a magical machine that can take a picture of your favorite toy, turn it into a secret code, and then use that code to recreate the toy’s picture. This is kind of what a Variational Autoencoder (VAE) does, but instead of toys, it works with things like images, sounds, or even words.

A VAE is a type of artificial brain (or neural network) that learns to compress data (like a picture) into a simpler form and then uses that simple form to recreate the original data. The “variational” part means that it doesn’t just learn one way to represent the data, but many possible ways, which helps it be more flexible and creative.
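
To make that idea concrete, here is a minimal, hedged PyTorch sketch of the encoder, the sampling step, and the decoder. The layer sizes and the flattened 784-dimensional input are assumptions chosen for the example, not a reference implementation.

import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """A minimal VAE: encode to a mean and variance, sample a code, decode."""

    def __init__(self, input_dim=784, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, latent_dim)      # mean of the code
        self.to_logvar = nn.Linear(128, latent_dim)  # log-variance of the code
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample the code instead of fixing it,
        # which is why the VAE learns many possible ways to represent x.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decoder(z), mu, logvar

# One forward pass on a fake batch of flattened 28x28 "images".
vae = TinyVAE()
x = torch.rand(4, 784)
recon, mu, logvar = vae(x)
# Loss = reconstruction error + KL term pulling the codes toward a standard normal.
recon_loss = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
print((recon_loss + kl).item())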

Math Foundations: Breaking it Down

Let’s keep things simple. Imagine you have a bunch of colored balls, and you want to sort them by color. First, you need to pick a way to describe each color with a number. This number is like the “code” that the VAE creates. Then, once you have the code, you need to figure out how to recreate the exact color from the number.

  1. Encoder: This part of the VAE is like a friend who looks at the color of the ball and writes down a secret code (a number) that represents that color.

Read More

Mutual Information

More on Mutual Information

Below is the code for computing the point-wise mutual information (PMI) scores that underlie WAPMI, or Weighted Average Point-wise Mutual Information. The code compares how often two words actually occur together with how often they would co-occur if they were independent.

It is a method used in computational linguistics to measure the strength of association between words in a given context, typically in the analysis of text data.

Imagine you have a box of different colored marbles, and you want to know which colors tend to appear together. WAPMI helps you figure out how often certain words (or colors) appear together more often than by random chance. It’s like a smart way to understand word relationships in sentences!

import math
import collections

def calculate_pmi(joint_prob, marginal_prob1, marginal_prob2):
    """
    Calculate the pointwise mutual information (PMI) between two words.

    :param joint_prob: The joint probability of the two words
    :param marginal_prob1: The marginal probability of the first word
    :param marginal_prob2: The marginal probability of the second word
    :return: The PMI score
    """
    if joint_prob == 0 or marginal_prob1 == 0 or marginal_prob2 == 0:
        return 0  # Avoid division by zero
    return math.log(joint_prob / (marginal_prob1 * marginal_prob2), 2)

def calculate_pmi_corpus_optimized(corpus):
    """
    Calculate the PMI scores for all pairs of words in a corpus.

    :param corpus: The corpus of text
    :return: A dictionary of PMI scores
    """
    word_counts = collections.defaultdict(int)
    cooccurrence_counts = collections.defaultdict(int)
    total_sentences = len(corpus)

    # Precompute word counts and co-occurrence counts
    for sentence in corpus:
        unique_words = set(sentence)  # Avoid counting duplicates within the same sentence
        for word in unique_words:
            word_counts[word] += 1
        for word1 in unique_words:
            for word2 in unique_words:
                if word1 != word2:
                    cooccurrence_counts[(word1, word2)] += 1

    # Calculate PMI scores
    pmi_scores = {}
    for (word1, word2), joint_count in cooccurrence_counts.items():
        joint_prob = joint_count / total_sentences
        marginal_prob1 = word_counts[word1] / total_sentences
        marginal_prob2 = word_counts[word2] / total_sentences
        pmi = calculate_pmi(joint_prob, marginal_prob1, marginal_prob2)
        pmi_scores[(word1, word2)] = pmi

    return pmi_scores

# Example usage
corpus = [
    ["this", "is", "a", "foo", "bar"],
    ["bar", "black", "sheep"],
    ["foo", "bar", "black", "sheep"],
    ["sheep", "bar", "black"]
]

pmi_scores = calculate_pmi_corpus_optimized(corpus)
print(pmi_scores)

Read More