This is a burp of an idea. The roughness of the post reflects the lack of manners of the belch.
So there's this paper that I think is really important. It's new, but it consolidates some old points in a beautiful demonstration: https://arxiv.org/abs/1906.09012
Roads&Love demonstrate that the statistical distributions of concepts as they're elicited from text (amodal sources) are well-matched by the statistical distributions of concepts as they're elicited from perceptual sources (their demonstration used visual stimuli).
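To make "well-matched statistical distributions" a bit more concrete, here is a minimal sketch of one way to score how well two concept systems line up. It is not Roads & Love's actual procedure. It assumes each system is a set of vectors over the same concepts, uses made-up random vectors in place of real text and image embeddings, and compares the two pairwise-similarity structures with a Pearson correlation. The names `similarity_matrix` and `alignment_score` are hypothetical.

```python
import numpy as np

def similarity_matrix(embeddings):
    """Cosine similarity between every pair of row vectors."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return unit @ unit.T

def alignment_score(sim_a, sim_b):
    """Pearson correlation between the upper triangles of two similarity matrices."""
    upper = np.triu_indices_from(sim_a, k=1)
    return float(np.corrcoef(sim_a[upper], sim_b[upper])[0, 1])

rng = np.random.default_rng(0)
n_concepts = 30

# Assumption: both systems are noisy views of one shared latent structure.
latent = rng.normal(size=(n_concepts, 5))
rotation, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal matrix

text_space = latent + 0.1 * rng.normal(size=latent.shape)
image_space = latent @ rotation + 0.1 * rng.normal(size=latent.shape)

sim_text = similarity_matrix(text_space)
sim_image = similarity_matrix(image_space)

# Score under the correct concept-to-concept mapping...
correct = alignment_score(sim_text, sim_image)

# ...and under a scrambled mapping, as a baseline.
shuffled = rng.permutation(n_concepts)
scrambled = alignment_score(sim_text, sim_image[np.ix_(shuffled, shuffled)])

print('correct mapping:', round(correct, 3))
print('scrambled mapping:', round(scrambled, 3))
```

The idea is that the correct mapping should score well above scrambled ones only when the two systems share structure, which is the property the alignment argument leans on.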
Okay, lame and boring. What does it mean? I think this paper has huge and broad implications regarding communication and meaningful interaction. Insofar as communication requires the conceptual alignment of multiple agents, this paper implies that communication is actually possible between two agents.
Okay, why is it valuable to demonstrate that communication (here, I mean the conceptual alignment of multiple agents) is actually possible? Isn't it obvious that communication is possible? (Here, I would cite and recite Quine and Putnam.) The answer from philosophers would be no, it isn't self-evident that multiple agents can align their conceptual systems such that communication is possible. Hence the value of an empirical demonstration.
But more importantly, this study illuminates the constraints by which such an alignment could occur. Namely, two amodal conceptual systems could align insofar as they're based on the same underlying system of concepts. The fact that Roads&Love could align an amodal text-based system of concepts and a perceptual image-based system of concepts is due to the fact that the text upon which the amodal system of concepts is based emerges from language use that emerges from perception. The development of text is inherently grounded in perception, and therefore the two systems can be aligned.
Extending this to communication, the extent to which two agents' conceptual systems are grounded in the same subset of reality (assuming that reality is the thing that is sampled for the construction of conceptual systems) determines the extent to which these systems can align, and thereby the extent to which communication can occur.
What are the constraints of communication? A shared set of perceptual experiences - in order for communication to happen between agents, a conceptual system grounded in the same set of perceptual experiences is necessary. Between agents of the same kind (e.g. humans), the majority of the work is carried by the perceptual systems being nearly identical - identical perceptual systems mean that the agents have access to the same subset of reality as one another. Perceptual systems that vary too much may yield perceptually based concept systems that are not easily aligned.
But even with shared perceptual systems, the experiences accrued by an agent may be such that the conceptual system they yield is not well-aligned to another agent's. C'est la vie.
But moreover, since the most interesting thing is the conceptual alignment between agents that can communicate, we need to talk about the conceptual systems that emerge from amodal sources, such as language or text.
(Language serves as a means of sharing conceptual systems)
If the amodal conceptual systems between two agents are not well-aligned, it might be for two (or more?) reasons. First, it might be because the perceptual experiences accrued by the agents from which the linguistic conveyance of amodal conceptual systems are derived are thoroughly dissimilar (can communication happen in this case? it's hard to say); in this case, even though linguistic inputs are well-aligned to perceptual experiences, conceptual alignment becomes challenging.
Second, it might be that, even though the perceptual experiences of the agents are aligned, the construction of the linguistic inputs for one or both of the agents emerges from a different statistical distribution than that which is accessible from perception (honestly, I think this is an interesting case, and potentially a source of ought).
So then, how do we increase the degree to which communication can occur between agents? If shared perceptual experiences generate conceptual systems that can be aligned, then as shared experiences between agents increase, the degree of possible communication between agents increases (likewise, as shared perceptual experiences decrease, the ability for agents to communicate decreases).
But this is dependent on the degree to which the amodal conceptual systems align, since communication occurs through amodal avenues. If amodal conceptual systems emerge along a statistical structure that does not align with the statistical structure of that which can be perceived, then the potential of communication occurring decreases, since the conceptual systems cannot be aligned.
This is really a pedantic way of saying that communication requires shared experiences, mediated by overlapping ideologies. But it provides a framework where these constructs can be formalized and empirically tested.
This makes a large claim about the degree to which communication can occur based on conceptual alignment, and it can be tested empirically. Measure each participant's conceptual system with something like a similarity rating task for items in a particular domain (politics and religion are salient; scientific topics or other domains of expertise would also work). The similarity ratings yield the conceptual system for that participant. Pair off participants and have them discuss topics within the domain for some allotment of time (10 minutes?). After the discussion, have participants do an exit survey: how well did they understand the other person's perspective, how well did they convey their ideas, how well did the other person understand what they said, and did they have to dumb down or hold back while discussing their position (did they hold back because they thought the other person wouldn't get it?). The prediction would be that pairs with better-aligned conceptual systems report more successful communication.
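Here is a minimal sketch of how the alignment between participants might be scored from such a task. It assumes every participant rates the same list of item pairs, so each participant's conceptual system is just a vector of ratings. The ratings below are made up purely to show the mechanics, and `pairwise_alignment` is a hypothetical name.

```python
import numpy as np
from itertools import combinations

def pairwise_alignment(ratings):
    """Map each pair of participants to the Pearson correlation of their ratings.

    ratings: dict of participant id -> 1-D array of similarity ratings,
             all given over the same item pairs in the same order.
    """
    return {
        (p, q): float(np.corrcoef(ratings[p], ratings[q])[0, 1])
        for p, q in combinations(sorted(ratings), 2)
    }

# Made-up ratings (1-7 scale) over the same six item pairs.
ratings = {
    'p1': np.array([7, 6, 2, 1, 5, 3]),
    'p2': np.array([6, 7, 1, 2, 5, 4]),
    'p3': np.array([1, 2, 7, 6, 3, 5]),
}

alignment = pairwise_alignment(ratings)
for pair, score in alignment.items():
    print(pair, round(score, 2))
```

Each pair's alignment score could then be set against that pair's exit-survey answers to see whether better-aligned pairs report understanding each other better.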
"Would you like a banana?"
"I'd like half a banana."
"You'd like to halve a banana."
"Yes, I'd like to have half a banana. Could you halve the banana for me?"
"I'll halve the banana so you can have half the banana. ... You had that half banana pretty fast."
"Yes, I have had a halved banana."
"Have you enjoyed your halved banana?"
"I have enjoyed it. I might enjoy the other half."
"You have had a halved banana. You could have the other half if you'd like."
"I would like to have more banana. Could you halve it for me?"
"I have halved the half banana. Have it."
"Thank you for halving the halved banana for me."
"You must like that banana. You now have had a halved half banana."
"The banana is quite delicious. Would you like to have the halved half remaining banana?"
"I'll halve it so we both can have it."
"You have had a halved halved half banana."
"And you have had a half banana, and a halved half banana, and a halved halved half banana. Have you enjoyed it?"
Or just a convenient plot device?
She kinda seems like a Ravenclaw to me.
I was playing around with the Harry Potter books (for a post that is yet to be released), when the realization hit me: I could use the computational power of semantic models to figure out whether Hermione really belongs to Gryffindor, or if she's actually a Ravenclaw.
Here's a rough outline of the process.
import nltk
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.axes as axes
from itertools import compress
import pickle
from scipy import sparse
from scipy.sparse.linalg import svds as svd
# load file
def read_file(nums):
    text = ''
    for num in nums:
        with open('../../../corpus/HarryPotter/text/hp' + str(num) + '.txt', 'rt',encoding='utf-8') as file:
            for line in file:
                text = text + line
    return text

# read in a file
book_content = read_file(range(1,8))

# load character names and houses
with open('character_words.p', 'rb') as file:
    character_words = pickle.load(file)
# tokenize on word
tokenized = nltk.word_tokenize(book_content)
# lowercase each word
tokenized_lower = [i.lower() for i in tokenized]
frequencies = nltk.FreqDist(tokenized_lower)
Reducing the corpus
Here's a plot showing the frequencies of the 5000 most frequent words used in Harry Potter. The curve follows a power law, commonly referred to as a Zipfian distribution. Simply put, most words don't occur that often, but the frequencies of a few words vastly outweigh those of the rest. These words (referred to as stop words) are removed from the corpus - their overwhelming frequency often prevents semantic models from picking up on any meaningful associations between words.
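# frequencies of the 5000 most common words, in rank order
y = [count for _, count in frequencies.most_common(5000)]
x = np.arange(1,len(y) + 1)
plt.plot(x,y)
plt.ylim([0,4000])
# mark the cut-off for the 200 most common words
plt.axvline(200, color = 'red')
plt.show()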
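A Zipfian curve looks roughly like a straight line on log-log axes, with a slope near -1. As a rough sanity check (not part of the original analysis), one could fit a line to log frequency against log rank. The sketch below does this on a synthetic token list built to be Zipf-like; `zipf_slope` is a hypothetical helper, and on the real corpus you would pass in `tokenized_lower` instead.

```python
from collections import Counter
import numpy as np

def zipf_slope(tokens, top_n=1000):
    """Slope of log(frequency) against log(rank) for the top_n most frequent tokens."""
    counts = sorted(Counter(tokens).values(), reverse=True)[:top_n]
    counts = np.array(counts, dtype=float)
    ranks = np.arange(1, len(counts) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(counts), 1)
    return float(slope)

# Synthetic corpus: the word at rank r appears about 1000 / r times.
toy_tokens = []
for rank in range(1, 51):
    toy_tokens.extend(['w%d' % rank] * (1000 // rank))

slope = zipf_slope(toy_tokens)
print('log-log slope:', round(slope, 2))
```

A slope far from -1 on real text would not necessarily mean anything is wrong, since tokenization choices and corpus size both bend the curve.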
# find the 200 most common words in the corpus
most_common_words = [word for word, _ in frequencies.most_common(200)]
# keep character names though
most_common_words = list(set(most_common_words) - set(character_words))
# find the words that occur five times or fewer
words, freqs = zip(*frequencies.most_common())
least_common_words = list(compress(words, [freq <= 5 for freq in freqs]))
# remove most and least common words from the corpus
# (a set makes each membership test fast)
words_to_drop = set(most_common_words) | set(least_common_words)
stop_listed_text = [i for i in tokenized_lower if i not in words_to_drop]
print('Length of original corpus: ' + str(len(tokenized_lower)))
print('Length of reduced corpus: ' + str(len(stop_listed_text)))
print(str(np.round((1 - len(stop_listed_text) / len(tokenized_lower)) * 100, 2)) + '% reduction')

Length of original corpus: 1392955
Length of reduced corpus: 447443
67.88% reduction
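For anyone who wants to try the stop-listing step without the full corpus, here is a small self-contained sketch of the same idea on a toy sentence. The function `stop_list` and its parameter names are hypothetical, and it uses `collections.Counter` rather than `nltk.FreqDist`.

```python
from collections import Counter

def stop_list(tokens, n_most_common, max_rare_count, keep=()):
    """Drop the n_most_common words (except those in keep) and any word
    that occurs max_rare_count times or fewer."""
    counts = Counter(tokens)
    frequent = {word for word, _ in counts.most_common(n_most_common)} - set(keep)
    rare = {word for word, count in counts.items() if count <= max_rare_count}
    drop = frequent | rare
    return [token for token in tokens if token not in drop]

tokens = 'the cat and the dog and the owl saw harry and harry saw harry'.split()

# 'harry' is among the three most frequent words but is protected by keep
reduced = stop_list(tokens, n_most_common=3, max_rare_count=1, keep={'harry'})
print(reduced)
```

The `keep` argument plays the role that `character_words` plays above: names stay in even when they are frequent.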
# save the reduced corpus
with open('stop_listed_text.p', 'wb') as file:
    pickle.dump(stop_listed_text, file)

# load the reduced corpus
with open('stop_listed_text.p', 'rb') as file:
    stop_listed_text = pickle.load(file)
# make a scrolling window - width = 10 words
# we're going to do a tapered window
windowed_text = []
length_window = 10
for i in np.arange(len(stop_listed_text) - length_window):
    windowed_text.extend(stop_listed_text[i:i + length_window])
# make a dictionary
words = sorted(list(set(stop_listed_text)))
dictionary = dict(zip(words, np.arange(len(words))))
# convert text to a word by document matrix
# tapered weights: 1 for the first word in a window, shrinking towards 0 for the last
data = 1 - np.arange(0,1,1/length_window)
data = np.tile(data, len(stop_listed_text) - length_window)
rows = np.arange(len(stop_listed_text) - length_window)
rows = np.repeat(rows, length_window)
cols = [dictionary[word] for word in windowed_text]
wd = sparse.csr_matrix((data, (rows, cols)), shape=(len(stop_listed_text) - length_window, len(dictionary)))
# reconstruct matrix along word and reduced document dimensions
# get ww: word-by-word co-occurrence, each row divided by that word's total weight
ww = wd.transpose().dot(wd).tolil()
ww = np.array(ww.transpose() / wd.sum(0)).transpose()

# save this
# with open('ww_hp.p','wb') as file:
#     pickle.dump(ww,file)
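To see what the tapered window and the word-by-word matrix look like, here is a toy version of the same construction on six tokens with a window of three. It is a sketch under simplifying assumptions: the token list is made up, the variable names are hypothetical, and the row normalization is left out because a word that never falls inside a window (here `cloak`) would cause a division by zero.

```python
import numpy as np
from scipy import sparse

tokens = ['harry', 'wand', 'owl', 'harry', 'wand', 'cloak']
length_window = 3

vocab = sorted(set(tokens))
index = {word: i for i, word in enumerate(vocab)}

# one window per start position, as in the loop over the full corpus
n_windows = len(tokens) - length_window

# tapered weights: 1, 2/3, 1/3
weights = 1 - np.arange(length_window) / length_window

rows = np.repeat(np.arange(n_windows), length_window)
cols = [index[tokens[start + offset]]
        for start in range(n_windows)
        for offset in range(length_window)]
data = np.tile(weights, n_windows)

# window-by-word matrix, then word-by-word co-occurrence
wd = sparse.csr_matrix((data, (rows, cols)), shape=(n_windows, len(vocab)))
ww = (wd.T @ wd).toarray()

print(vocab)
print(np.round(ww, 2))
```

Each row of `wd` is one window, with the first word weighted most heavily, and `ww` sums the products of those weights over every window that two words share.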
# for each house name and each character name, set all house and character names to 0
with open('character_words.p','rb') as file:
    character_words = pickle.load(file)
house_character_names = list(set(words).intersection(character_words))
for i in house_character_names:
    for j in house_character_names:
        ww[dictionary[i],dictionary[j]] = 0
def cosineTable(vects):
    # cosine similarity between every pair of row vectors
    norms = np.sqrt(np.sum(vects*vects,1))
    return np.dot(vects,vects.transpose()) / np.outer(norms,norms)
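As a quick check of what a cosine table returns, here is a standalone sketch on four hand-picked vectors. `cosine_table` is a hypothetical re-implementation using `np.linalg.norm`, written so the example runs on its own.

```python
import numpy as np

def cosine_table(vects):
    """Cosine similarity between every pair of row vectors."""
    norms = np.linalg.norm(vects, axis=1)
    return (vects @ vects.T) / np.outer(norms, norms)

vects = np.array([
    [1.0, 0.0],   # points along the first axis
    [2.0, 0.0],   # same direction, different length
    [0.0, 3.0],   # orthogonal to the first two
    [1.0, 1.0],   # halfway between the axes
])

table = cosine_table(vects)
print(np.round(table, 3))
```

Vectors pointing the same way score 1 regardless of length, and orthogonal vectors score 0, which is why cosine is a reasonable choice when some words are simply more frequent than others.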
names = ['ravenclaw', 'gryffindor', 'hufflepuff', 'slytherin', 'harry', 'cho', 'cedric', 'draco', 'justin']
rows = [dictionary[i] for i in names]
test = ww[rows]
# similarity of each house (rows) to each character (columns)
values = cosineTable(test)[:4,4:]
labels = names[4:]
# rows of values follow the order of the house names in names[:4]
ravenclaw = values[0]
gryffindor = values[1]
hufflepuff = values[2]
slytherin = values[3]

x = np.arange(len(labels))  # the label locations
width = 0.2  # the width of each bar (four bars per character)

fig, ax = plt.subplots()
ax.bar(x - 1.5 * width, gryffindor, width, label='Gryffindor',color='#ae0001')
ax.bar(x - 0.5 * width, ravenclaw, width, label='Ravenclaw', color='#033e8c')
ax.bar(x + 0.5 * width, hufflepuff, width, label='Hufflepuff',color='#ffdb00')
ax.bar(x + 1.5 * width, slytherin, width, label='Slytherin',color='#2c8309')

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_xlabel('Characters')
ax.set_ylabel('Cosine similarity')
ax.set_title('Similarity between House and Character')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()
fig.tight_layout()
plt.show()
The figure above plots the similarities between the vectors representing each character and each house. The x-axis is the character, the y-axis is the cosine similarity (you'd better be impressed by the colors - I scoured the internet for the best hex-codes).
I wanted to make sure that the model could correctly place a character in each house - Harry (obviously) has to be placed in Gryffindor, I chose Cho Chang to represent Ravenclaw, Draco to represent Slytherin, and Cedric Diggory to represent Hufflepuff. I was a little annoyed that the model placed Cedric in Gryffindor (I guess he wasn't that great of a finder after all), so I had to dig deeper to find another Hufflepuff to validate my model. I settled on Justin Finch-Fletchley as representative (oh come on, you know who he is... he's the one who did the one thing that one time...). Anyway, Cedric aside, the model correctly identifies the house of a representative from each house.
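Reading the placements off a bar chart works for a handful of characters, but the same decision can be written down as an argmax over the house-by-character similarity block. The sketch below is one way to do that. `sort_characters` is a hypothetical helper, and the numbers are placeholders chosen for illustration, not values from the model.

```python
import numpy as np

def sort_characters(similarities, houses, characters):
    """Assign each character (column) to the house (row) with the highest similarity."""
    best = np.argmax(similarities, axis=0)
    return {character: houses[i] for character, i in zip(characters, best)}

houses = ['ravenclaw', 'gryffindor', 'hufflepuff', 'slytherin']
characters = ['character_a', 'character_b']

# Placeholder similarities: rows are houses, columns are characters.
placeholder = np.array([
    [0.10, 0.40],
    [0.30, 0.20],
    [0.20, 0.10],
    [0.05, 0.30],
])

placements = sort_characters(placeholder, houses, characters)
print(placements)
```

With the real model, `placeholder` would be replaced by the `values` array computed from `cosineTable`, and `characters` by `names[4:]`.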
With the model modestly validated, let's see how it sorts Hermione:
names = ['ravenclaw', 'gryffindor', 'hufflepuff', 'slytherin', 'hermione']
rows = [dictionary[i] for i in names]
test = ww[rows]
# similarity of each house (rows) to Hermione (single column)
values = cosineTable(test)[:4,4:]
labels = names[4:]
# rows of values follow the order of the house names in names[:4]
ravenclaw = values[0]
gryffindor = values[1]
hufflepuff = values[2]
slytherin = values[3]

x = np.arange(len(labels))  # the label locations
width = 0.2  # the width of each bar (four bars per character)

fig, ax = plt.subplots()
ax.bar(x - 1.5 * width, gryffindor, width, label='Gryffindor',color='#ae0001')
ax.bar(x - 0.5 * width, ravenclaw, width, label='Ravenclaw', color='#033e8c')
ax.bar(x + 0.5 * width, hufflepuff, width, label='Hufflepuff',color='#ffdb00')
ax.bar(x + 1.5 * width, slytherin, width, label='Slytherin',color='#2c8309')

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_xlabel('Characters')
ax.set_ylabel('Cosine similarity')
ax.set_title('Similarity between House and Character')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()
# fig.tight_layout()
plt.show()
And there we have it. Despite her bookish nature and the cleverness of her wand hand, Hermione is clearly a Gryffindor (though, if she were anything else, it would obviously be a Ravenclaw).
Also, let's be amused by Draco for a moment. It's not that he's more Slytherin than anyone else. It's just that he's less of anything else than Slytherin.
These posts are mostly just fun things for me to do. They're works in progress (even after they're posted). If you think there's something I should look at, comment, and I'll add it to my to-do list.