Merge pull request #19 from A-Baji/dev

A-Baji · web-flow · commit 7035f0fb01bf · 2023-02-17T23:29:35.000-06:00
thought min max and bold prompts
diff --git a/README.md b/README.md
@@ -30,21 +30,21 @@ Pick a channel and user whose chat logs you want to use for creating your custom
 
 You can follow [this guide](https://turbofuture.com/internet/Discord-Channel-ID) to learn how to find a channel's ID. Make sure that you include the full username with the #id, and wrap it in quotes if it contains spaces. The `--dirty` flag prevents the outputted dataset files from being deleted. Downloaded chat logs get saved and reused, but you can set the `--redownload` flag if you want to update the logs.
 
-You may have noticed the lack of a model customization process occurring after running that command. This is because no base model was selected, but before you specify a base model, you should analyze the generated dataset located in the directory mentioned in the logs. Chat messages are parsed into a dataset by grouping individual messages sent within a certain timeframe into "thoughts", where each thought is a completion in the dataset. The default for this timeframe is 10 seconds. If your dataset looks a bit off, try different timeframe settings using the `-t` option: 
+You may have noticed the lack of a model customization process occurring after running that command. This is because no base model was selected, but before you specify a base model, you should analyze the generated dataset located in the directory mentioned in the logs. Chat messages are parsed into a dataset by grouping individual messages sent within a certain timeframe into "thoughts", where each thought is a completion in the dataset. The default for this timeframe is 10 seconds. The length of each thought must also fall between the minimum and maximum thought lengths. The defaults are 4 words for the minimum and `None` (no limit) for the maximum. If your dataset looks a bit off, try different settings using the `--ttime`, `--tmin`, and `--tmax` options:
 
-`discordai model create -c <channel_id> -u "<username#id>" -t <timeframe> --dirty`
+`discordai model create -c <channel_id> -u "<username#id>" --ttime <timeframe> --tmax <thought_max> --tmin <thought_min> --dirty`
 
-After you've found a good timeframe setting, you will want to manage your dataset's size. The larger your dataset, the more openAI credits it will cost to create a custom model. By default, the max dataset size is set to 1000. If your dataset exceeds this limit, it will be reduced using either a "first", "last", "middle", or "even" reduction method. The "first" method will select the first n messages, "last" will select the last n, "middle" will select the middle n, and "even" will select an even distribution of n messages. The default reduction method is even. You can set the max dataset size and reduction mode using the `-m` and `-r` options: 
+After you've found good thought settings, you will want to manage your dataset's size. The larger your dataset, the more openAI credits it will cost to create a custom model. By default, the max dataset size is set to 1000. If your dataset exceeds this limit, it will be reduced using either a "first", "last", "middle", or "even" reduction method. The "first" method will select the first n messages, "last" will select the last n, "middle" will select the middle n, and "even" will select an even distribution of n messages. The default reduction method is even. You can set the max dataset size and reduction mode using the `-m` and `-r` options: 
 
-`discordai model create -c <channel_id> -u "<username#id>" -t <timeframe> -m <max_size> -r <reduction_mode> --dirty`
+`discordai model create -c <channel_id> -u "<username#id>" --ttime <timeframe> --tmax <thought_max> --tmin <thought_min> -m <max_size> -r <reduction_mode> --dirty`
 
 If you are planning on creating multiple models, you may want to get your hands on multiple openAI API keys in order to maximize the free credit usage. You can assign specific api keys to custom models using the `-o` option. Otherwise, the key provided in your config will be used.
 
 Now that you have fine tuned your dataset, you can finally begin the customization process by specifying a base model. OpenAI has four base [models](https://beta.openai.com/docs/models/gpt-3): davinci, curie, babbage, and ada, in order of most advanced to least advanced. Generally you will want to use davinci, but it is also the most expensive model as well as the longest to customize. Select your base model with the `-b` option.
 
 Your final command should look something like this: 
 
-`discordai model create -c <channel_id> -u "<username#id>" -t <timeframe> -m <max_size> -r <reduction_mode> -b <base_model>`
+`discordai model create -c <channel_id> -u "<username#id>" --ttime <timeframe> --tmax <thought_max> --tmin <thought_min> -m <max_size> -r <reduction_mode> -b <base_model>`
 
 If you find the training step to cost too many credits with your current options, you can cancel it with `discordai job cancel -j <job_id>`, and then either lower your max dataset size, or choose a different discord channel and/or user. You can get a list of all your jobs with `discordai job list --simple`.
 ### Test the new model
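
Review note on the README change above: the grouping itself happens in `discordai_modelizer`, which is not part of this diff, so the following is only a minimal sketch of how `--ttime`, `--tmin`, and `--tmax` could interact. The `Message` class, the `group_thoughts` function, the inclusive bounds, and the choice to drop (rather than truncate) out-of-range thoughts are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    # Hypothetical message record; the real log format is not shown in this diff.
    timestamp: float  # seconds
    content: str


def group_thoughts(messages: List[Message], thought_time: int = 10,
                   thought_min: int = 4, thought_max: Optional[int] = None) -> List[str]:
    """Join consecutive messages into thoughts, then filter by word count."""
    thoughts: List[str] = []
    current: List[str] = []
    last_ts: Optional[float] = None
    for msg in messages:
        # Assumption: a gap longer than thought_time starts a new thought.
        if last_ts is not None and msg.timestamp - last_ts > thought_time:
            thoughts.append(" ".join(current))
            current = []
        current.append(msg.content)
        last_ts = msg.timestamp
    if current:
        thoughts.append(" ".join(current))

    def within_limits(thought: str) -> bool:
        words = len(thought.split())
        if words < thought_min:
            return False
        # thought_max=None is treated as "no upper limit".
        return thought_max is None or words <= thought_max

    return [t for t in thoughts if within_limits(t)]
```

Under these assumptions, a one-word reply such as "ok" sent on its own is filtered out by the default minimum of 4 words, while a long monologue is kept unless a maximum is set.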
diff --git a/discordai/command_line.py b/discordai/command_line.py
@@ -110,6 +110,13 @@ def discordai():
         dest='stop_default',
         help="Set the stop option to use for completions to True",
     )
+    new_cmd_optional_named.add_argument(
+        "--bolden",
+        action='store_true',
+        required=False,
+        dest='bolden',
+        help="Boldens the original prompt in the completion output",
+    )
 
     delete_cmd = bot_cmds_commands_subcommand.add_parser(
         "delete", description="Delete a slash command from your bot"
@@ -179,13 +186,29 @@ def discordai():
         help="The base model to use for customization. If none, then skips training step: DEFAULT=none",
     )
     model_create_optional_named.add_argument(
-        "-t", "--thought-time",
+        "--ttime", "--thought-time",
         type=int,
         default=10,
         required=False,
         dest='thought_time',
         help="The max amount of time in seconds to consider two individual messages to be part of the same \"thought\": DEFAULT=10",
     )
+    model_create_optional_named.add_argument(
+        "--tmax", "--thought-max",
+        type=int,
+        default=None,
+        required=False,
+        dest='thought_max',
+        help="The max length in words of each thought: DEFAULT=None",
+    )
+    model_create_optional_named.add_argument(
+        "--tmin", "--thought-min",
+        type=int,
+        default=4,
+        required=False,
+        dest='thought_min',
+        help="The minimum length in words of each thought: DEFAULT=4",
+    )
     model_create_optional_named.add_argument(
         "-m", "--max-entries",
         type=int,
@@ -301,6 +324,13 @@ def discordai():
         dest='openai_key',
         help="The openAI API key associated with the job to see the status for: DEFAULT=config.openai_key",
     )
+    job_status_optional_named.add_argument(
+        "--events",
+        action='store_true',
+        required=False,
+        dest='events',
+        help="Simplify the output to just the event list",
+    )
 
     job_cancel = job_subcommand.add_parser(
         "cancel", description="Cancel an openAI customization job"
@@ -351,17 +381,18 @@ def discordai():
         if args.subcommand == "commands":
             if args.subsubcommand == "new":
                 template.gen_new_command(args.model_id, args.command_name, args.temp_default, args.pres_default,
-                                         args.freq_default, args.max_tokens_default, args.stop_default, args.openai_key)
+                                         args.freq_default, args.max_tokens_default, args.stop_default, args.openai_key,
+                                         args.bolden)
             elif args.subsubcommand == "delete":
                 template.delete_command(args.command_name)
     elif args.command == "model":
         if args.subcommand == "list":
             openai_wrapper.list_models(args.openai_key, args.simple)
         if args.subcommand == "create":
             customize.create_model(config["token"], args.openai_key, args.channel, args.user,
-                                   thought_time=args.thought_time, max_entry_count=args.max_entries,
-                                   reduce_mode=args.reduce_mode, base_model=args.base_model, clean=args.dirty,
-                                   redownload=args.redownload)
+                                   thought_time=args.thought_time, thought_max=args.thought_max, thought_min=args.thought_min,
+                                   max_entry_count=args.max_entries, reduce_mode=args.reduce_mode, base_model=args.base_model, 
+                                   clean=args.dirty, redownload=args.redownload)
         if args.subcommand == "delete":
             openai_wrapper.delete_model(args.openai_key, args.model_id)
     elif args.command == "job":
@@ -370,7 +401,7 @@ def discordai():
         if args.subcommand == "follow":
             openai_wrapper.follow_job(args.openai_key, args.job_id)
         if args.subcommand == "status":
-            openai_wrapper.get_status(args.openai_key, args.job_id)
+            openai_wrapper.get_status(args.openai_key, args.job_id, args.events)
         if args.subcommand == "cancel":
             openai_wrapper.cancel_job(args.openai_key, args.job_id)
     elif args.command == "config":
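
Review note on the `command_line.py` hunks above: each thought option registers two long spellings that share one `dest`. The sketch below reproduces only those three `add_argument` calls on a standalone parser to show how either spelling lands in the same attribute; the `prog` name and the `parse` helper are hypothetical and not part of the project.

```python
import argparse

# Standalone parser mirroring the three thought options from the diff.
parser = argparse.ArgumentParser(prog="discordai-sketch")
parser.add_argument("--ttime", "--thought-time", type=int, default=10,
                    required=False, dest="thought_time")
parser.add_argument("--tmax", "--thought-max", type=int, default=None,
                    required=False, dest="thought_max")
parser.add_argument("--tmin", "--thought-min", type=int, default=4,
                    required=False, dest="thought_min")


def parse(argv):
    """Parse an explicit argument list instead of sys.argv."""
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Short and long spellings can be mixed freely.
    print(parse(["--ttime", "30", "--thought-max", "50"]))
```

Since the diff replaces the `-t` spelling with `--ttime`, existing scripts that pass `-t` would presumably need updating after this change.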
diff --git a/discordai/template.py b/discordai/template.py
@@ -44,10 +44,10 @@ async def customai(self, context: Context, prompt: str = "", temp: float = {temp
                 frequency_penalty=presPen,
                 presence_penalty=freqPen,
                 max_tokens=max_tokens,
-                echo=True if prompt else False,
+                echo=False,
                 stop='.' if stop else None,
             )
-            await context.send(response[\'choices\'][0][\'text\'][:2000])
+            await context.send(f"{{'**' if {bold} else ''}}{{prompt}}{{'**' if {bold} else ''}}{{response[\'choices\'][0][\'text\'][:2000]}}")
         except Exception as error:
             print({error})
             await context.send(
@@ -63,7 +63,7 @@ async def setup(bot):
 
 
 def gen_new_command(model_id: str, command_name: str, temp_default: float, pres_default: float, freq_default: float,
-                    max_tokens_default: int, stop_default: bool, openai_key: str):
+                    max_tokens_default: int, stop_default: bool, openai_key: str, bold_prompt: bool):
     if getattr(sys, 'frozen', False):
         # The code is being run as a frozen executable
         data_dir = pathlib.Path(appdirs.user_data_dir(appname="discordai"))
@@ -84,7 +84,7 @@ def gen_new_command(model_id: str, command_name: str, temp_default: float, pres_
                 command_name=command_name, temp_default=float(temp_default),
                 pres_default=float(pres_default),
                 freq_default=float(freq_default),
-                max_tokens_default=max_tokens_default, stop_default=stop_default, openai_key=openai_key,
+                max_tokens_default=max_tokens_default, stop_default=stop_default, openai_key=openai_key, bold=bold_prompt,
                 error="f\"Failed to generate valid response for prompt: {prompt}\\nError: {error}\""))
         print(f"Successfully created new slash command: /{command_name} using model {model_id}")
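
Review note on the `template.py` hunks above: the new `context.send` line mixes single braces (`{bold}`, filled in by `str.format` when the command file is generated) with doubled braces (which survive as f-string fields in the generated code). The following is a trimmed-down sketch of that two-stage rendering; `TEMPLATE`, `render`, and `build_render` are hypothetical stand-ins, and `exec` is used here only to make the generated source callable for demonstration.

```python
# Hypothetical miniature of the command template: {bold} is substituted at
# generation time, while {{...}} becomes {...} inside the generated f-string.
TEMPLATE = '''def render(prompt, text):
    marker = "**" if {bold} else ""
    return f"{{marker}}{{prompt}}{{marker}}{{text[:2000]}}"
'''


def build_render(bold: bool):
    """Generate the source for render() and return the resulting function."""
    source = TEMPLATE.format(bold=bold)
    namespace = {}
    exec(source, namespace)  # demonstration only
    return namespace["render"]


if __name__ == "__main__":
    print(build_render(True)("Hello", " world"))
    print(build_render(False)("Hello", " world"))
```

One thing that may be worth checking in the real template: the `[:2000]` slice applies only to the completion text, so the prompt plus the completion (plus the bold markers) could exceed 2000 characters, which I believe is Discord's message length limit.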
 
diff --git a/discordai/version.py b/discordai/version.py
@@ -1 +1 @@
-__version__ = "1.2.1"
+__version__ = "1.3.0"
diff --git a/requirements.txt b/requirements.txt
@@ -2,4 +2,4 @@ discord.py
 openai
 pandas
 appdirs
-discordai_modelizer @ git+https://github.com/A-Baji/discordAI-modelizer.git@1.1.0
+discordai_modelizer @ git+https://github.com/A-Baji/discordAI-modelizer.git@1.2.0
