LOG_build
get index
Input: none
Output:$(MAIL)/questions.term
Eshell V5.9 (abort with ^G)
1> Written: /Users/joe/.sherlock/mails/questions.html
1> Written: /Users/joe/.sherlock/mails/questions.term
1> Years = ["1997","1998","1999","2000","2001","2002","2003","2004","2005",
"2006","2007","2008","2009","2010","2011","2012","2013"]
1> Parsing mails for: 2009
1> Parsing: 2009-January.txt.gz
1> Parsing: 2009-February.txt.gz
1> Parsing: 2009-March.txt.gz
1> Parsing: 2009-April.txt.gz
1> Parsing: 2009-May.txt.gz
1> Parsing: 2009-June.txt.gz
1> Parsing: 2009-July.txt.gz
1> Parsing: 2009-August.txt.gz
1> Parsing: 2009-September.txt.gz
1> Parsing: 2009-October.txt.gz
1> Parsing: 2009-November.txt.gz
1> Parsing: 2009-December.txt.gz
1> Written: /Users/joe/.sherlock/mails/2009/parsed.bin
1> Year: 2009 #entries = 7906 size = 3.56 Megabytes average =450.34 bytes/entry
1> Computing mail IDF for: 2009
1> Adding synthetic keywords for:2009
1> Written binary store:/Users/joe/.sherlock/mails/2009/mails.bin
1> Written listing:"/Users/joe/.sherlock/mails/2009/mails.list"
1> 946: UBF and JSON Protocols
1> Query took:22 ms #results=1
1> ----
ID: 946
Date: Sun, 15 Feb 2009 12:39:10 +0100
From: Joe Armstrong
Subject: UBF and JSON Protocols
For a long time I have been interested in describing protocols. In
2002 I published a contract system called UBF for defining protocols.
This scheme was never widely adopted - perhaps it was just to
strange...
I have revised UBF and recast it in a form which I call JSON Protocols
- since JSON is widely implemented, this method of described protocols
might be more acceptable...
Read the remainder of this on
http://armstrongonsoftware.blogspot.com/2009/02/json-protocols-part-1.html
/Joe Armstrong
1> ** searching for a mail in 2009 similar to the content of file:./src/sherlock_tfidf.erl
1> Searching for=[<<"idf">>,<<"word">>,<<"remove">>,<<"words">>,<<"tab">>,
<<"duplicates">>,<<"ets">>,<<"keywords">>,<<"bin">>,<<"skip">>,
<<"file">>,<<"index">>,<<"binary">>,<<"frequency">>,<<"dict">>]
1> 7260 : 0.27 Word Frequency Analysis
1> 7252 : 0.27 Word Frequency Analysis
1> 7651 : 0.18 tab completion and word killing in the shell
1> 4297 : 0.17 ets vs process-based registry + local vs global dispatch
1> 5324 : 0.16 ets memory usage
1> 5325 : 0.15 ets memory usage
1> 1917 : 0.14 A couple of design questions
1> 1860 : 0.12 leex and yecc spotting double newline
1> 5361 : 0.11 dict slower than ets?
1> 1991 : 0.11 Extending term external format to support shared substructures
1> ----
ID: 7260
Date: Fri, 04 Dec 2009 17:57:03 +0100
From: =?ISO-8859-15?Q?Johann_H=F6chtl?=
Subject: Word Frequency Analysis
Hello!
I need to compute a word frequency analysis of a fairly large corpus. At
present I discovered the disco database
http://discoproject.org/
which seems to include a tf-idf indexer. What about couchdb? I found an
article that it fails rather quickly (somewhere between 100 and 1000
wikipedia text pages)
http://knuthellan.com/2009/07/09/the-couchdb-indexer-lightweight-search-engine-in-hours/
Are there other erlang frameworks or can somebody provide me with a hint
to another DBM system which naturally supports wortd frequncy analysis?
Thank you!
Regards,
Johann
1> Searching for a mail in 2009 similar to mail number 7260 in 2009
1> Searching for=[<<"indexer">>,<<"analysis">>,<<"couchdb">>,<<"wortd">>,
<<"idf">>,<<"knuthellan.com">>,<<"frequncy">>,<<"corpus">>,
<<"dbm">>,<<"discoproject.org">>,<<"disco">>]
1> 7260 : 0.84 Word Frequency Analysis
1> 7252 : 0.84 Word Frequency Analysis
1> 6844 : 0.21 couchdb in Karmic Koala
1> 6848 : 0.21 couchdb in Karmic Koala
1> 6847 : 0.20 couchdb in Karmic Koala
1> 6849 : 0.19 couchdb in Karmic Koala
1> 7264 : 0.17 Re: erlang search engine library?
1> 6843 : 0.16 couchdb in Karmic Koala
1> 2895 : 0.15 CouchDB integration
1> 69 : 0.14 dialyzer fails when using packages and -r
1> ----
ID: 6844
Date: Thu, 19 Nov 2009 17:26:46 +0300
From: Dmitry Belyaev
Subject: couchdb in Karmic Koala
1) ps -A | grep couchdb
or
2) /etc/init.d/couchdb {start|stop|status}
3) sure you can if you database name will not interfere any other
As I remember couchdb package depends on erlang. So it may be installed
as always in /usr/bin and /usr/lib/erlang
On Thu, 2009-11-19 at 15:06 +0100, Joe Armstrong wrote:
> I just upgraded my ubuntu to Karmic Koala
>
> couchdb seems to have installed itself and a few other things (horray
> - well done ...)
>
> $ pwd
> /usr/lib/couchdb/erlang/lib
> $ ls
> couch-0.10.0 erlang-oauth etap ibrowse-1.5.2 mochiweb-r97
>
> I think couchdb is running on my machine.
>
> A few questions:
>
> 1) Is couchdb running on my machine and how can I confirm this?
> 2) How is couchdb started and stopped?
> 3) Can I use couchdb for my own applications of is it reserved for system use?
> 4) Where is Erlang hidden away?
> 5) Can I use the hidden Erlang for my own applications
> 6) Can I distribute my own Erlang applications that make use of the
> Erlang that has (apparently) been installed in Karmic Koala
>
> Having Erlang on all Karmic Koala machines could lead to many exciting
> things :-)
>
> /Joe
>
> ________________________________________________________________
> erlang-questions mailing list. See http://www.erlang.org/faq.html
> erlang-questions (at) erlang.org
>
>
1>