This repository has been archived by the owner on Nov 3, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Image features in polyencoder #2412
Merged
Merged
Changes from 3 commits
Commits
Show all changes
115 commits
Select commit
Hold shift + click to select a range
86b0ea3
Start to pass through image features
EricMichaelSmith 109733c
Work on image features in polyencoder
EricMichaelSmith e4929bf
Work on revamping
EricMichaelSmith 744f4ca
Simplify image encoding
EricMichaelSmith 5584ac9
Tweax
EricMichaelSmith bf36218
Prepend mode
EricMichaelSmith fb9f38e
Add mode
EricMichaelSmith 744a33a
Flag fix
85cc617
Minor fixes
0043ea4
Setup for merged encoder
EricMichaelSmith f3430d1
Merge branch 'polyencoder-with-image' of github.com:facebookresearch/…
EricMichaelSmith 3990ebc
Getting new encoder to match old
EricMichaelSmith ba9188d
Remove unneeded things
EricMichaelSmith 09a8b4c
Revert a bit
EricMichaelSmith 92b73c9
Multiple trivial reduction types
EricMichaelSmith 1dd717d
Fix pos
EricMichaelSmith e88c411
Add postpend mode
EricMichaelSmith bf32454
Change param name back
EricMichaelSmith c49b7cb
Rip out old code
EricMichaelSmith 8627c72
New cat()
EricMichaelSmith 29d390a
Smart add
EricMichaelSmith e7f4108
Fix forward()
EricMichaelSmith 1896dd3
Comment
EricMichaelSmith 54af92f
Minor
EricMichaelSmith 5990018
Minor
EricMichaelSmith 8d4fdfc
Finally, merge in new ContextWithImage module
EricMichaelSmith 172ec95
Import bugs
431104d
Fixes
225f90d
Breakpoint
3ae3b08
Minor, but maybe will help test to pass
e7f1283
Remove breakpoint
fd5fd8d
Dealing with new params
dad3b36
DataParallel fix
1cc2a01
Register untrained params as buffers
b17cc0f
Add persona+WoW teachers
a00f01e
Ugh handle no image
730ad07
Fill in Nones
2a01122
Cleanup and enforce always having image features
929b20a
Let's be tough on tensors
b82678a
Always use dummy features
00f42d2
Typo
2d3bf27
Delete additional teachers for now
EricMichaelSmith 6c146d6
postpend -> append
EricMichaelSmith f0c1e1f
Private methods
EricMichaelSmith f5d489c
Simplify
EricMichaelSmith 431793f
Dump ImagePolyencoderAgent
EricMichaelSmith b954cdd
Split functionality
EricMichaelSmith a7748b4
Remove duplicated methods
EricMichaelSmith d0bd597
Split remaining Agent code
EricMichaelSmith bec4d63
Move Agent
EricMichaelSmith e6a5e11
Dump new model
EricMichaelSmith 54f6377
Refer to new model
EricMichaelSmith 0373ad5
Don't duplicate code
EricMichaelSmith b2cbefe
Start to separate model
EricMichaelSmith 683ac81
Start to split .encode()
EricMichaelSmith d1b0bbc
More DRY
EricMichaelSmith 355fcf6
End of splitting
EricMichaelSmith 33fc1fb
Remove flag
EricMichaelSmith 51f96a1
Tweak defaults
EricMichaelSmith bbd6460
Yes - added
EricMichaelSmith ca03db3
warn_once
EricMichaelSmith ddc6f02
Remove unneeded giant function
EricMichaelSmith 1b266a7
Cosmetic
EricMichaelSmith 4d2f125
Avoid a lot of repeated code
EricMichaelSmith 5eb08fe
Revert old code
EricMichaelSmith e3b1f7a
Name change
EricMichaelSmith b2b8f72
Update docstring
EricMichaelSmith 56608ab
Split into separate module
EricMichaelSmith 9c35126
Fix kwarg issue
EricMichaelSmith 8c8c799
Change a variable named "dict"
6ee97b3
Reduction type fix
3f9272f
No, have to use kwargs
b85b06f
Fix getattr() bug
8cf287b
Same trick for the context encoder
9e00af8
Use right encoder types
b98194c
Remove unneeded method
2d19a3a
Autoformat
5a95902
Multi-layer image encoder fix
139c6cf
TorchImageAgent
EricMichaelSmith 32a1b26
Merge branch 'master' into polyencoder-with-image
bdf02d1
Fix adding args
de1b6b1
CPU fix
f9acfc9
Allow multiple image tokens
aed9983
Formatting
7cc5822
Name consistency
f329f56
Partial fix for token-dim issue
e65c426
Rest of fp16 function
b0931e3
Dump previous test
EricMichaelSmith 8442ce4
Revamp basic task
EricMichaelSmith fa301e9
Set up image polyencoder test
da37eea
Change output message
2b1897a
Minor test tweaks
76ac292
Bad mode
4028578
Token fix
1ad5826
Fix
a0a70e5
Deal with no image features
61a9be6
Merge branch 'master' into polyencoder-with-image
EricMichaelSmith 003f302
Segments fix
75c2b96
Fix test
81ceabf
autoformat
2213d42
Add commit string
EricMichaelSmith f5f58d8
Silently delete param
EricMichaelSmith adc95a1
Minor
EricMichaelSmith aa0bf20
mypy
EricMichaelSmith b08b979
Overhaul task
c1f549b
More tasks
eb54800
fp16 in new function
ff7dd95
Minor
0f6f4fc
Cosmetic
9737599
More description
9e5049b
[long] Assert
ed1d59c
mypy tweaks
7b4fd65
Make tests faster
6d0b994
Increase time image test has
b053eb2
Make test easier
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -42,6 +42,15 @@ def add_cmdline_args(cls, argparser): | |
choices=['add', 'append', 'prepend'], | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. once we have a solid grasp of which method is best, let's add a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added TODO for this |
||
help='How to combine image embedding (if used) with context embedding', | ||
) | ||
agent.add_argument( | ||
'--n-image-tokens', | ||
type=int, | ||
default=1, | ||
help=( | ||
'Number of tokens that the image encoding will consist of (when adding ' | ||
'or prepending)' | ||
), | ||
) | ||
agent.set_defaults(reduction_type=None) | ||
# This agent doesn't support any encoder output reductions | ||
return agent | ||
|
@@ -162,6 +171,7 @@ def get_encoder(self, opt, dict_, null_idx, reduction_type, for_context: bool): | |
image_encoder_num_layers=opt['image_encoder_num_layers'], | ||
image_features_dim=opt['image_features_dim'], | ||
image_combination_mode=opt['image_combination_mode'], | ||
n_image_tokens=opt['n_image_tokens'], | ||
) | ||
else: | ||
# The candidate encoder is the same as for PolyEncoderModule | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you use in practice? This looks like it could cause that multiple-of-8 issue I was mentioning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, my time steps definitely aren't multiples of 8. Given what you said, I'm sure that's causing a major slowdown - I first have to address some issues with the training itself not going as expected (due to image features not being given weight when ImageChat personalities are also present), but then I'll take a look at this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we don't end up seeing any gain from varying this I would say we can just keep it at 1 and reduce some unnecessary complexity later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - so, preliminary sweeps show that it might be giving us a few-point boost, presumably due to the image features just taking up more space in the final encoding. I'd prefer to keep it in unless you all really hate it :)