Friday, November 14, 2014

People are infected by others.

Hello, this is my fourth blog post for the Social Media Analysis class.

Social media analysis is also related to Communication Studies. People are active on social networks, and together they form an abstract social network. We can also analyze the flow of information from this angle.

In lecture 10, Rosanna talked about strong and weak ties. In our project we found that Weibo is built mostly on weak ties. On Weibo, you can follow or unfollow anyone unilaterally, so popular bloggers have to maintain these weak ties in many ways.

I have learned about the SIR model. It divides the population into three compartments: S = number susceptible, I = number infectious, and R = number recovered (immune). This is a good and simple model for many infectious diseases. It can also apply to the spread of information. Someone posts a rumor, and a person who believes it becomes infectious. An infectious person who learns the truth corrects his view and joins the recovered group. Those who do not believe the rumor from the beginning are the immune group. It is not easy to be immune to a disease, and the same is true of a rumor.
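To see how the three groups change over time, here is a minimal sketch of the SIR model in Python. The population size, the infection rate `beta`, the recovery rate `gamma`, and the step size `dt` are values I made up for illustration, and the simple Euler stepping is only an approximation of the real equations.

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps, dt=0.1):
    """Step the SIR equations forward with simple Euler updates.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt  # susceptible -> infectious
        new_recoveries = gamma * i * dt         # infectious -> recovered
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical numbers: 990 people have not heard the rumor, 10 are spreading it.
history = simulate_sir(s0=990, i0=10, r0=0, beta=0.5, gamma=0.1, steps=1000)
peak_infectious = max(i for _, i, _ in history)
print("peak number of infectious people:", round(peak_infectious))
```

In the rumor version of the story, `beta` would stand for how easily people believe what they read and `gamma` for how quickly they learn the truth.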

The picture below shows the SIR model.




Recently, a game named Plague Inc. has been No. 1 on Apple's App Store. I tried the game and found that it is related to the SIR model. You act as a virus, and your goal is to kill everyone in the world. At the beginning, the disease spreads very slowly. If we look at the infection curve, it is shaped like an 'S'. By playing the game, you can really feel how terrible the spread of a disease is. For both a disease and information on social media, the beginning is the stage we can control.




Thursday, October 16, 2014

Interdisciplinary & Graph

After the fifth class, I found that social media analysis is an interdisciplinary field. It draws on anthropology, sociology, mathematics, and psychology. I found an interesting piece of software named Gephi, and there is a survey about the research areas of the people who use it. Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs. Its users are not only in computer-related areas; many of them work in materials science, communication, history, and even architecture.

After collecting data, we should focus on data analysis. There are many algorithms and tools to use. The foundation is graph theory, with which we convert abstract relationships into a directed graph.
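As a small illustration of turning relationships into a directed graph, here is a sketch that stores "who follows whom" as an adjacency list. The account names and the `build_directed_graph` helper are hypothetical; they are not taken from our project data.

```python
def build_directed_graph(edges):
    """Turn (follower, followed) pairs into an adjacency list of sets."""
    graph = {}
    for follower, followed in edges:
        graph.setdefault(follower, set()).add(followed)
        # Accounts that follow nobody should still appear as nodes.
        graph.setdefault(followed, set())
    return graph


# Made-up follow relationships: the edge direction is follower -> followed.
follows = [("ann", "phone_brand"), ("bob", "phone_brand"), ("ann", "bob")]
graph = build_directed_graph(follows)
for user, targets in sorted(graph.items()):
    print(user, "->", sorted(targets))
```

A directed edge fits a platform where following is one-sided, because "ann follows bob" does not imply "bob follows ann".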

This is related to our project. Our goal is to analyze different types of phones using data from social media platforms. A graph can help us find the relationship between topics and followers. Maybe by reversing the process, we can also find the relationship between followers and topics. That means: at the beginning, the company launches a topic about its product, and the followers follow the topic. Now suppose we already have the followers; how can we create the topic? It makes me think about six degrees of separation. This concept describes the net-like structure of our society, focusing on the connections between points.
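The idea of six degrees of separation can be checked on a graph by counting the steps on the shortest path between two people. Below is a minimal sketch using breadth-first search. The `friends` dictionary is a made-up toy network and `degrees_of_separation` is a name I chose for illustration.

```python
from collections import deque


def degrees_of_separation(friends, start, goal):
    """Return the number of edges on the shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        for neighbor in friends.get(person, ()):
            if neighbor == goal:
                return distance + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


# Toy network: a - b - c - d form a chain, and e knows nobody.
friends = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": [],
}
print(degrees_of_separation(friends, "a", "d"))  # 3
print(degrees_of_separation(friends, "a", "e"))  # None
```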

Wednesday, October 1, 2014

About Sentiment Analysis

Hi, I am liruo, and this is my second blog post.

This topic reminds me of a movie named The Social Network. Adapted from Ben Mezrich's 2009 book The Accidental Billionaires: The Founding of Facebook, A Tale of Sex, Money, Genius, and Betrayal, the film portrays the founding of the social networking website Facebook and the resulting lawsuits.

I noticed that the website attracted many people to sign up and that its user population grew exponentially. Everyone liked this new idea and was willing to join. And as we all know, people use this network not only to contact friends but also to do analysis.



In class, Rosanna talked about sentiment analysis. That means we can use tools to analyze a person's feelings, which makes the computer seem semi-intelligent. In my further reading, I learned that the bag-of-words model can be used to analyze sentiment. An LDA model is one model built on this idea. Other researchers use neural networks to train a supervised model to analyze sentiment. Also, I found a blog that talks about using Python to do sentiment analysis: http://rzcoding.blog.163.com/
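To make the bag-of-words idea concrete, here is a very small sketch in Python. It counts the words in a comment and compares them with two tiny word lists. The `POSITIVE` and `NEGATIVE` sets are hypothetical and far too small for real use, and this word-counting approach is only one simple way to use a bag of words, not the method from the lecture or from the blog linked above.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, chosen only for this example.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def bag_of_words(text):
    """Count how often each lowercase word appears, ignoring word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def sentiment_score(text):
    """Positive word count minus negative word count."""
    bag = bag_of_words(text)
    positive = sum(count for word, count in bag.items() if word in POSITIVE)
    negative = sum(count for word, count in bag.items() if word in NEGATIVE)
    return positive - negative


print(sentiment_score("I love this phone, the camera is great"))  # 2
print(sentiment_score("The battery is bad"))                      # -1
```

Because word order is thrown away, a sentence like "not good" would still count as positive here, which is one reason more advanced models are used in practice.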

We can use sentiment analysis to understand customers' attitudes toward a product and then improve the product. As far as I know, the series House of Cards was designed this way. The company collected people's online comments on other movies and series and then worked out which elements of films and series were the most popular. After that, they combined these elements to create the series. Apparently, the show was a great success.


Sunday, September 21, 2014

About the first three SMA classes

Hello, this blog is about my social media analysis course.

We have had three lectures of Social Media Analysis so far. I think the key point of this course is analytics. Lecture one tells us that we will use mathematics and computer tools to analyze social networks; lecture two covers content analysis and NLP; lecture three covers clustering.

Lecture one:

I found some social media analysis pictures on the internet. One of them is shown below. In the picture, a point represents an entity and an edge represents a relationship. We can see that everyone has relationships with others and that everyone can reach anyone else through the people he is connected to. By modeling the social network as a graph, we can calculate and obtain a lot of valuable information. For example, we can tell how important one person is to the whole network by counting how many edges are connected to him, and so on.
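Counting the edges connected to each person can be sketched in a few lines of Python. The edge list below is invented for illustration, and `degree_counts` is just a name I picked.

```python
from collections import Counter


def degree_counts(edges):
    """Count how many edges touch each node of an undirected graph."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Made-up friendships: each pair is one edge in the network.
edges = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
degrees = degree_counts(edges)
print(degrees.most_common(1))  # [('ann', 3)]
```

Here ann touches the most edges, so by this simple measure she is the most important person in the toy network.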
Lecture two:

After learning NLP, I find it amazing that one word has relationships with other words, and fascinating that language can be computed. Think about this: when you type one word, the computer can guess which word you will type next. This technology uses statistics to analyze text, and we have come to use big data to help with it. One example of sentence analysis is shown below. Using NLP, we can divide a sentence into words, tag parts of speech, recognize named entities, parse the sentence, and so on. These basic tasks are used in many applications, such as search engines, speech recognition, and optical character recognition.
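Guessing the next word from statistics can be sketched with a simple bigram count: for each word, remember which words followed it and how often. The three training sentences are made up, and real systems use far more data and better models than this.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """For every word, count the words that directly follow it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Tiny invented corpus.
corpus = [
    "social media analysis is fun",
    "social media is everywhere",
    "social network analysis is useful",
]
model = train_bigrams(corpus)
print(predict_next(model, "social"))  # media
```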
Lecture three:

This lecture was mainly about clustering. K-means clustering is very useful and efficient. Prof. Chan gave us an example about how to distinguish a smiling face from a sad face. But when the coordinates are (1, 1) and (6, 6), we cannot tell them apart. I searched on the internet and found that an SVM may be able to solve this problem. We can use a high- or infinite-dimensional space, or more features, to distinguish the faces. SVM is defined as below.
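For reference, here is a minimal sketch of K-means in plain Python. The six 2-D points are invented and are not the face example from the lecture, and taking the first `k` points as the starting centroids is a simplification I chose so that the result is repeatable.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, max_iterations=100):
    """Alternate between assigning points and moving centroids until stable."""
    centroids = list(points[:k])  # naive, deterministic starting centroids
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return labels, centroids


# Two made-up groups of points, one near the origin and one further away.
points = [(1.0, 1.2), (0.8, 0.9), (1.1, 0.7), (5.9, 6.1), (6.2, 5.8), (6.0, 6.3)]
labels, centroids = k_means(points, k=2)
print(labels)
```

K-means only groups points by distance and never looks at labels such as "smiling" or "sad", which is different from a classifier like an SVM that is trained on labeled examples.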

A support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class (so-called functional margin), since in general the larger the margin the lower the generalization error of the classifier[1].
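The idea of the "largest distance to the nearest training data point" can be illustrated without training a real SVM. The sketch below only compares two hand-picked separating lines and measures their margins; it is not an SVM solver. The four labeled points and both candidate lines are made up, and the distance used is the ordinary geometric distance from a point to the line.

```python
import math


def margin(w, b, samples):
    """Smallest distance from any sample to the hyperplane w . x + b = 0.

    Each sample is (point, label) with label +1 or -1. Returns None when
    some sample lies on the wrong side, so the hyperplane does not separate.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    distances = []
    for x, label in samples:
        score = label * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        if score <= 0:
            return None
        distances.append(score / norm)
    return min(distances)


# Invented training data: two points per class.
samples = [((1, 1), -1), ((2, 1), -1), ((5, 5), 1), ((6, 5), 1)]

diagonal = margin((1, 1), -6.5, samples)  # the line x + y = 6.5
vertical = margin((1, 0), -3.5, samples)  # the line x = 3.5
print(diagonal, vertical)
```

Both lines separate the two classes, but the diagonal one leaves more room around the nearest points, so it is the kind of separator an SVM would prefer.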

Reference:
[1]http://en.wikipedia.org/wiki/Support_vector_machine