Comprehension is seen as a transduction which transforms a linear structure, i.e. [...]

To mark the reviews we first need a group of already evaluated characteristics, that is, a learning base. On different websites we can find film reviews with the mark assigned (e.g. [...]). We used those data (reviews, users, marks) to create our learning base. We regrouped all the reviews by their mark. We thus obtained 5 different groups of film reviews: a group of reviews with score 1, 2.

The grammar is based on the learning base, which contains about 2000 sentences for each mark group (Dziczkowski and Wegrzyn-Wolska, 2007b). For this part we use a linguistic treatment which requires lexicons and a specialized grammar. The development of such resources is a long and tiresome task, which generally requires expertise in the field concerned and knowledge of computational linguistics, such as techniques for filtering, document categorization and information extraction.

The work presented in this paper is focused on the linguistic knowledge, using the linguistic resources described in section 2 (Cover, 1991), (Dave et al., 2000), (Pang and Lee, 2004), (Wang et al., 2003).

Statistic classifier: Statistical research based on the Bayes classifier, a probabilistic categorizer founded on Bayes' theorem.

For the entire corpus of reviews we calculate the distance of the characteristics of new reviews to the characteristics of the groups. The features are, for example: characteristic words, sentence length, corpus width, detection of negation, characteristic expressions, and special punctuation.
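The text above only says that the statistic classifier is a probabilistic categorizer founded on Bayes' theorem and that the learning base is grouped by mark. As a minimal sketch of that idea, the example below trains a bag-of-words naive Bayes model on reviews labelled with marks. The word-level features, the add-one smoothing, the function names `train` and `predict`, and the toy reviews are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def train(reviews):
    """Build a toy model from (text, mark) pairs (hypothetical helper)."""
    word_counts = defaultdict(Counter)  # mark -> word frequencies
    doc_counts = Counter()              # mark -> number of reviews
    for text, mark in reviews:
        doc_counts[mark] += 1
        word_counts[mark].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the mark with the highest posterior log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_mark, best_score = None, -math.inf
    for mark in doc_counts:
        # log prior of the mark group
        score = math.log(doc_counts[mark] / total_docs)
        # add-one (Laplace) smoothing: an assumption, not from the source
        denom = sum(word_counts[mark].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[mark][word] + 1) / denom)
        if score > best_score:
            best_mark, best_score = mark, score
    return best_mark


# Toy learning base, invented for illustration only
learning_base = [
    ("a wonderful moving film", 5),
    ("wonderful acting and a great story", 5),
    ("boring and far too long", 1),
    ("a boring predictable story", 1),
]
model = train(learning_base)
print(predict(model, "a wonderful story"))
print(predict(model, "boring story"))
```

In a real setting the learning base would hold one group of reviews per mark, and the features could include the other characteristics listed above (sentence length, negation, punctuation) rather than words alone.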
Statistic-linguistic classifier: Statistical research on linguistic data to determine the behaviour of reviews which have the same mark.

Linguistic classifier: For each sentence of a review we assign a grammar rule that expresses the intensity of opinion.

Together with the statistic classifier, these are the three different approaches we use for marking reviews.

The principal tasks are: collecting the reviews from the Internet, checking whether the text found is a review, assigning a mark to the reviews, and presenting the results. This paper is focused on the review-marking module, and more precisely on the linguistic method of classifying the reviews. We developed three different methods for assigning a mark to the reviews. These methods are based on different approaches to corpus classification. For each method we developed a classifier which separately assigns a mark. At the end we obtain three marks for one review, which can be different. We use another classifier which assigns the final mark to the review based only on the three marks obtained from the previous classifiers (Dziczkowski and Wegrzyn-Wolska, 2007a), (Dziczkowski and Wegrzyn-Wolska, 2007b). The process of assigning the mark to a review is shown in Figure 1.

The lexicon-grammar is a systematic description of the syntactic and semantic properties of the syntactic factors, that is, predicative verbs, nouns and adjectives. It is organized in groups of tables, which are associated with a syntactic category such as full verbs, support verbs, nouns, etc. A table corresponds to a particular syntactic construction and gathers all the words entering this construction. Currently lexicon-grammar is especially developed for verbs and predicative phrases (Tarveen and Hill, 2001) (Turney and Littman, 2003). Since each word has an almost unique behaviour, the tables give the grammar of each element of the lexicon, which is why they are called lexicon-grammar tables. With Unitex we can build grammars from such tables.
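To make the idea of a lexicon-grammar table more concrete, here is a minimal sketch that stores one table as a word-by-property matrix and reads off the "grammar" of each word from its row. The property names, the three verbs and their values are invented for illustration, and the dictionary layout is an assumption; it is not the file format that Unitex uses.

```python
# Hypothetical table for one construction class: rows are verbs, columns are
# syntactic properties; True stands for "+" and False for "-" in the table.
PROPERTIES = ["N0 V N1", "N0 V that S", "passive"]

TABLE = {
    "like":  [True, True, True],
    "enjoy": [True, False, True],
    "bore":  [True, False, True],
}


def grammar_of(word):
    """Return the constructions the table marks as accepted for a word."""
    row = TABLE[word]
    return [prop for prop, accepted in zip(PROPERTIES, row) if accepted]


def words_entering(prop):
    """Return all words of the table that enter a given construction."""
    idx = PROPERTIES.index(prop)
    return sorted(word for word, row in TABLE.items() if row[idx])


print(grammar_of("enjoy"))
print(words_entering("N0 V that S"))
```

Reading a row gives the constructions accepted by one word, while reading a column gathers the words that enter one construction, which matches the two views of a table described above.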
Tables of lexicon-grammar are matrices that outline the properties of all the simple verbs, which are described by syntactic properties. The linguistic phenomena are represented by local grammars, which are then translated into finite-state automata so that they can easily be matched against text corpora. The text corpora are themselves represented by automata, in which each state corresponds to a lexical analysis. [...] of lexical clarification were developed to implement the grammatical constraints described before using this type of graph.
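The translation of a local grammar into a finite-state automaton can be sketched as follows. The example recognizes a small opinion pattern (optional negation, optional intensifiers, then an opinion adjective) over a list of tokens. The lexicon, the token classes, the state numbering and the function names are hypothetical, and the text is simplified to a flat token list instead of the text automaton described above.

```python
# Tiny hand-written lexicon mapping tokens to classes (invented values).
LEXICON = {
    "not": "NEG", "very": "INT", "really": "INT",
    "good": "ADJ", "bad": "ADJ", "boring": "ADJ",
}

# Local grammar as an automaton: state -> {token class -> next state}.
TRANSITIONS = {
    0: {"NEG": 1, "INT": 2, "ADJ": 3},
    1: {"INT": 2, "ADJ": 3},
    2: {"INT": 2, "ADJ": 3},
}
FINAL_STATES = {3}


def match_at(tokens, start):
    """Return the exclusive end index of a match starting at `start`, or None."""
    state, i = 0, start
    while i < len(tokens) and state not in FINAL_STATES:
        token_class = LEXICON.get(tokens[i])
        next_state = TRANSITIONS.get(state, {}).get(token_class)
        if next_state is None:
            return None  # no transition for this token: the path fails
        state, i = next_state, i + 1
    return i if state in FINAL_STATES else None


def find_matches(text):
    """Scan the text left to right and collect every recognized sequence."""
    tokens = text.lower().split()
    matches, i = [], 0
    while i < len(tokens):
        end = match_at(tokens, i)
        if end is None:
            i += 1
        else:
            matches.append(" ".join(tokens[i:end]))
            i = end
    return matches


print(find_matches("the film is not very good but really boring"))
```

A rule attached to each recognized sequence could then express the intensity of the opinion, for example by treating a match that passes through the negation state differently from one that does not.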