# installer (c) Andreas Buerki 2015, licensed under the EUPL V.1.1.
version="0.1"
####

# define functions
help ( ) {
    echo "
Usage:    $(basename "$0") [OPTIONS]
Example:  $(basename "$0") -u
IMPORTANT: this script should not be moved outside of its original directory.
           (it will stop working if it is moved)
Options:  -u uninstalls the software
          -V displays version information
          -p only attempts to set path
"
}

# analyse options
while getopts hpuV opt
do
    case $opt in
    h)  help
        exit 0
        ;;
    u)  uninstall=true
        ;;
    p)  pathonly=true
        ;;
    V)  echo "$(basename "$0") - version $version"
        echo "Copyright (c) 2015 Andreas Buerki"
        echo "licensed under the EUPL V.1.1"
        exit 0
        ;;
    esac
done
echo ""
echo "Installer"
echo "---------"
echo ""
# check if there is a space in home:
if [ "$(grep -o ' ' <<<"$HOME")" ]; then
    echo "The home directory contains a space: $HOME" >&2
    echo "This will cause problems during installation." >&2
    echo "See https://www.cygwin.com/ml/cygwin/2007-09/msg00423.html on how to change this in Cygwin." >&2
    echo "Installation aborted" >&2
    exit 1
fi

# check what platform we're under
platform=$(uname -s)
# and make adjustments accordingly
if [ "$(grep 'CYGWIN' <<<"$platform")" ]; then
    sourcedir="$0"
else
    sourcedir="$(dirname "$0")"
fi
# check it's in its proper directory
if [ "$(grep 'SubString' <<<"$sourcedir")" ]; then
    :
elif [ "$sourcedir" == "." ]; then
    sourcedir=$(pwd)
    if [ "$(grep 'SubString' <<<"$sourcedir")" ]; then
        :
    else
        echo "This installer script appears to have been moved out of its original directory. Please move it back into its original directory and run it again." >&2
        exit 1
    fi
else
    echo "This installer script appears to have been moved out of its original directory. Please move it back into its original directory and run it again." >&2
    exit 1
fi


# set path
# echo "current path: $PATH"
# work out if $HOME has a space and fix accordingly
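
A note on the two checks in the installer excerpt (a space in `$HOME`, and `SubString` appearing in the script's directory): both could also be expressed with `case` patterns, which avoids starting a `grep` process for each test. The following is only a sketch of that alternative, not the installer's own code; the function names `has_space` and `in_substring_dir` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; these names do not appear in the installer itself.

# Succeeds if the argument contains at least one space character.
has_space() {
    case "$1" in
        *" "*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Succeeds if the given path contains the string 'SubString'.
in_substring_dir() {
    case "$1" in
        *SubString*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example: warn (without aborting) if the home directory contains a space.
if has_space "$HOME"; then
    echo "The home directory contains a space: $HOME" >&2
fi
```

Pattern matching with `case` works in any POSIX shell, so this form would not depend on bash here-strings.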
README.md: 34 additions & 17 deletions
@@ -1,5 +1,5 @@

-SubString v0.9.5
+SubString v0.9.6
================

The SubString package is a set of Unix shell scripts used to consolidate frequencies of word n-grams of various different n (i.e. word n-grams of different lengths). In the process, the frequencies of substrings are reduced by the frequencies of their superstrings and a consolidated list with n-grams of different length is produced without an inflation of the overall count. The functions performed by this package will primarily be of interest to linguists and computational linguists working on formulaic language, multi-word sequences and other phraseological phenomena.
@@ -45,37 +45,45 @@ The current release of the SubString package contains the following components:
* `test_data` a directory containing test data

* `EUPL.pdf` a copy of the European Union Public License under which SubString is licensed.
+* `OSX_installer.command` double-clickable installer for OS X

+* `linux_installer.desktop` double-clickable installer for Linux

-C. Installation
----------------
-
-SubString was tested on MacOS X (v. 10.8 and 10.9), Ubuntu Linux (version Xubuntu 14.04) and Cygwin (version 1.7.30), but should run on all platforms on which a bash shell is installed. This includes Windows with the [Cygwin](cygwin.com) package installed. For efficient processing of larger amounts of data, bash v. 4 is necessary (although the software will substitute a slower algorithm if only bash v. 3 is available).[^1]
+C. Compatible Systems
+------
+SubString was tested on OS X (v. 10.8 and 10.9), Ubuntu Linux (version Xubuntu 14.04) and Cygwin (version 1.7.30), but should run on all platforms on which a bash shell is installed. This includes Windows with the [Cygwin](cygwin.com) package installed. For efficient processing of larger amounts of data, bash v. 4 is necessary (although the software will substitute a slower algorithm if only bash v. 3 is available).[^1]
[^1]: Most recent operating system versions have bash v. 4 installed as standard, but MacOS X has bash v. 3.2 installed as standard. Bash v. 4 can be installed using [MacPorts](http://www.macports.org), [Homebrew](http://brew.sh) or similar and then the new version would either need to be put in the directory `/bin` (replacing the old version) or the first line of the `substring.sh` script would need adjusting to point to the new version of bash (if installed via MacPorts, the new line would read `#!/opt/local/bin/bash` instead of `#!/usr/bin/env bash`).

+
+D. Installation
+---------------
+
Generally, all scripts (i.e. the files ending in .sh) should be placed in a location that is in the user's $PATH variable (or the location should be added to the $PATH variable) so they can be called from the command line. A good place to put the scripts might be /usr/local/bin or $HOME/bin.

-Detailed instructions of how to do this are given here:
+For OS X and Linux, an installer is provided that takes care of these installation steps. Inside the SubString directory, double-click on `linux_installer` (for Linux) or `OSX_installer` (OS X). This replaces previous versions of the installed files. It may be necessary to log out and log in again before the installation takes effect. The success of the installation can be verified by opening a terminal window and typing the following: `substring.sh -h`. A help message should be displayed. If the installation was successful, the rest of this section can be skipped. Depending on the particular system setup, automatic installation might fail. In this case, or if running on Cygwin, the following manual installation instructions should be followed.

-1. open the Terminal application
+Detailed instructions for manual installation:

-    MacOS X: in Applications/Utilities
+1. open the Terminal application
+
+    OS X: in Applications/Utilities

    Ubuntu Linux: via menu Applications>Accessories>Terminal

    Cygwin: via the link on the desktop to Cygwin Terminal
-2. type: `mkdir /usr/local/bin` (it may say 'File exists', that's fine)
-3. type: `echo $PATH` (if you can see /usr/local/bin somewhere in the
-    output, move to step 8, if not carry on with the next step)
-4. type: `cd $HOME`
+2. type: `mkdir $HOME/bin` (it may say 'File exists', that's fine)
+3. type: `echo $PATH` (if you can see /Users/YOURNAME/bin somewhere in the
+    output, move to step 9, if not carry on with the next step)
+4. type: `cd $HOME` and then type `ls .bash_profile` (including the period). If this does NOT produce an error, in the following steps, always use `.bash_profile` where it says `.profile`. If there is an error, the instructions in the following steps can be followed exactly as they are written.
+5. check if the file .bash_profile is present
    type: `cp .profile .profile.bkup` (if it says there no such file,
    that's fine)
5. type: `vi .profile`
6. move to an empty line and press the `i` key, then enter the
-    following: `PATH=/usr/local/bin:$PATH`
+    following: `PATH=$HOME/bin:$PATH`
7. press ESC, then type `:wq!`
8. move into the SubString directory. This can be done by typing `cd ` (make sure there is a space after `cd ` but don't press return yet) and then dragging the SubString folder onto the Terminal window and pressing return.
-9. type: `sudo cp *.sh /usr/local/bin` (you will need to enter an admin password)
+9. type: `cp *.sh $HOME/bin`

Done!
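
For readers who prefer a script to an interactive `vi` session, the manual steps above can be approximated as follows. This is a sketch under stated assumptions, not the behaviour of the provided installer: `install_scripts` is a hypothetical helper, and appending the `PATH=` line with `printf` stands in for editing the profile file by hand.

```shell
#!/usr/bin/env bash
# install_scripts is a hypothetical helper, not part of SubString.
# $1: directory holding the .sh scripts (the SubString directory)
# $2: target directory for the scripts (e.g. "$HOME/bin")
# $3: profile file that receives the PATH line if the target is not in PATH
install_scripts() {
    local src="$1" bin_dir="$2" profile="$3"
    mkdir -p "$bin_dir" || return 1              # create the target directory
    case ":$PATH:" in
        *":$bin_dir:"*)
            ;;                                   # already in PATH, leave the profile alone
        *)
            # back up an existing profile, then append the PATH line
            [ -e "$profile" ] && cp "$profile" "$profile.bkup"
            printf 'PATH=%s:$PATH\n' "$bin_dir" >> "$profile"
            ;;
    esac
    cp "$src"/*.sh "$bin_dir"                    # copy the scripts
}

# Possible usage (commented out so that nothing is changed by accident):
#   profile="$HOME/.profile"
#   [ -e "$HOME/.bash_profile" ] && profile="$HOME/.bash_profile"
#   install_scripts "$PWD" "$HOME/bin" "$profile"
```

As with the manual route, a new login session may be needed before the changed `PATH` takes effect.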
@@ -89,10 +97,19 @@ The installation can be verified by calling each script's help function for the

For further tests, you may wish to run SubString on the test data (see next section)

+To uninstall, use the terminal to move into the SubString directory (type `cd ` followed by a space, then drop the Substring-X.X.X directory on to the terminal window), then type `OSX_installer.command -u` (for Linux or OS X). Alternatively, manually delete the relevant files from the directory `/Users/YOURNAME/bin` where YOURNAME is the user name.
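
The manual alternative mentioned above could be sketched as below. It assumes that "the relevant files" are the installed copies of the `.sh` files found in the SubString directory; that reading, and the helper name `remove_installed_scripts`, are assumptions rather than anything the README states.

```shell
#!/usr/bin/env bash
# remove_installed_scripts is a hypothetical helper, not part of SubString.
# $1: the SubString directory holding the original .sh files
# $2: the directory the scripts were installed into (e.g. "$HOME/bin")
remove_installed_scripts() {
    local src="$1" bin_dir="$2" script name
    for script in "$src"/*.sh; do
        [ -e "$script" ] || continue             # the glob matched nothing
        name=$(basename "$script")
        if [ -f "$bin_dir/$name" ]; then
            rm -- "$bin_dir/$name" && echo "removed $bin_dir/$name"
        fi
    done
}

# Possible usage, run from inside the SubString directory:
#   remove_installed_scripts "$PWD" "$HOME/bin"
```

Only files whose names match a script in the SubString directory are removed, so other programs in the same `bin` directory are left untouched.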

D. Operation
------------

+Open the Terminal application if not already open:
+
+* OS X: in Applications/Utilities
+
+* Ubuntu Linux: via menu Applications>Accessories>Terminal
+
+* Cygwin: via the link on the desktop to Cygwin Terminal
+

**LISTCONV.SH**

@@ -106,8 +123,8 @@ n·gram· 1[ 1]
That is, an n-gram (with constituents either delimited by diamonds (<>) or the unicode character interpunct (middle dot)), followed by a tab and the frequency count, optionally followed by another tab and a document count. The listconv.sh script can be used to convert n-gram lists into this format. listconv.sh is able to convert output lists created with the [N-Gram Processor](http://buerki.github.io/ngramprocessor), the [Ngram Statistics Package](http://ngram.sourceforge.net), the [NGramTools](http://homepages.inf.ed.ac.uk/lzhang10/ngram.html) or n-grams lists provided by the [Google Books corpus](http://storage.googleapis.com/books/ngrams/books/datasetsv2.html), although the latter will need previous selection of the relevant data as the same n-grams are listed for many different years.
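
A quick way to see whether a list already follows this layout is to count the lines that deviate from it. The sketch below is illustrative only: `check_list_format` is a hypothetical helper, and it assumes that frequency and document counts are plain non-negative integers and that every n-gram contains at least one `<>` or `·` delimiter.

```shell
#!/usr/bin/env bash
# check_list_format is a hypothetical helper, not part of SubString.
# Prints the number of lines in file $1 that do not match the layout
#   n-gram <TAB> frequency [<TAB> document count]
check_list_format() {
    awk -F'\t' '
        NF < 2 || NF > 3            { bad++; next }   # wrong number of fields
        $1 !~ /(<>|·)/              { bad++; next }   # no constituent delimiter
        $2 !~ /^[0-9]+$/            { bad++; next }   # frequency is not an integer
        NF == 3 && $3 !~ /^[0-9]+$/ { bad++; next }   # document count is not an integer
        END { print bad + 0 }
    ' "$1"
}

# Possible usage:
#   check_list_format my_ngrams.lst
```

A result of `0` suggests the list can be used as it is; any other number suggests running it through `listconv.sh` first.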

-To convert an n-gram list, simply supply the names of the files to
-be converted as arguments:
+To convert an n-gram list, supply the names of the files to
+be converted as arguments: (open a terminal window if not yet open)

    listconv.sh FILE+

@@ -282,7 +299,7 @@ None reported at this time. Issues can be raised at [http://github.com/buerki/Su
F. Copyright, licensing, download
---------------------------------

-SubString is (c) 2011-2014 Andreas Buerki, licensed under the EUPL V.1.1. (the European Union Public Licence) which is an open-source licence (see the EUPL.pdf file for the full licence).
+SubString is (c) 2011-2015 Andreas Buerki, licensed under the EUPL V.1.1. (the European Union Public Licence) which is an open-source licence (see the EUPL.pdf file for the full licence).

The project resides at [http://buerki.github.com/SubString/](http://buerki.github.com/SubString/) and new versions will be posted there. Suggestions and feedback are welcome. To be notified of new releases, go to https://github.com/buerki/SubString, click on the 'Watch' button and sign in.