Commit e21ee28

Merge pull request #86 from notdanhan/master
Stream Mapping mode
2 parents 307aca8 + c9f2a0e commit e21ee28

15 files changed: 914 additions and 527 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ helix-core-api/
 .DS_Store
 .vscode/
 *.zip
+compile_commands.json
 
 # Depot conversion results
 clones/

CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ option(BUILD_TESTS "Build tests" OFF)
 
 set(CXX_STANDARD_REQUIRED true)
 set(CMAKE_CXX_STANDARD 11)
+set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
 
 project(
     p4-fusion

README.md

Lines changed: 43 additions & 10 deletions
@@ -28,7 +28,12 @@ These execution times are expected to scale as expected with larger depots (mill
 ## Usage
 
 ```shell
-[ PRINT @ Main:56 ] Usage:
+[ PRINT @ Main:59 ] Usage:
+--branch [Optional, Default is empty]
+    A branch to migrate under the depot path. May be specified more than once. If at least one is given and the noMerge option is false, then the Git repository will include merges between branches
+    in the history. You may use the formatting 'depot/path:git-alias', separating the Perforce branch sub-path from the git alias name by a ':'; if the depot path contains a ':', then you must provide
+    the git branch alias.
+
 --client [Required]
     Name/path of the client workspace specification.
 
@@ -44,12 +49,6 @@ These execution times are expected to scale as expected with larger depots (mill
 --lookAhead [Required]
     How many CLs in the future, at most, shall we keep downloaded by the time it is to commit them?
 
---branch [Optional]
-    A branch to migrate under the depot path. May be specified more than once. If at least one is given and the noMerge option is false, then the Git repository will include merges between branches in the history. You may use the formatting 'depot/path:git-alias', separating the Perforce branch sub-path from the git alias name by a ':'; if the depot path contains a ':', then you must provide the git branch alias.
-
---noMerge [Optional, Default is false]
-    When false and at least one branch is given, then . If this is true, then the Git history will not contain any merges, except for an artificial empty commit added at the root, which acts as a common source to make later merges easier.
-
 --maxChanges [Optional, Default is -1]
     Specify the max number of changelists which should be processed in a single run. -1 signifies unlimited range.
 
@@ -59,8 +58,11 @@ These execution times are expected to scale as expected with larger depots (mill
 --noColor [Optional, Default is false]
     Disable colored output.
 
+--noMerge [Optional, Default is false]
+    Disable performing a Git merge when a Perforce branch integrates (or copies, etc) into another branch.
+
 --path [Required]
-    P4 depot path to convert to a Git repo
+    P4 depot path to convert to a Git repo. If used with '--branch', this is the base path for the branches.
 
 --port [Required]
     Specify which P4PORT to use.
@@ -77,6 +79,9 @@ These execution times are expected to scale as expected with larger depots (mill
 --src [Required]
     Relative path where the git repository should be created. This path should be empty before running p4-fusion for the first time in a directory.
 
+--streamMappings [Optional, Default is false]
+    Use Mappings defined by Perforce Stream Spec for a given stream
+
 --user [Required]
     Specify which P4USER to use. Please ensure that the user is logged in.
 ```
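The `--branch` value format described in the usage text above (`depot/path:git-alias`) can be illustrated with a small sketch. This is not p4-fusion's actual parser: `splitBranchArg` is a hypothetical helper, and two behaviours are assumptions made for illustration. It splits at the last `:` so that a depot path containing `:` still parses when an alias is given, and it reuses the path as the alias when no `:` is present.

```cpp
#include <string>
#include <utility>

// Hypothetical helper (not part of p4-fusion): split a --branch value of the
// form "depot/path:git-alias" into {depot sub-path, Git branch alias}.
// Assumption: the LAST ':' is the separator, so a depot path that itself
// contains ':' still works as long as an alias is supplied.
// Assumption: with no ':' at all, the sub-path doubles as the alias.
std::pair<std::string, std::string> splitBranchArg(const std::string& arg)
{
    const std::string::size_type pos = arg.rfind(':');
    if (pos == std::string::npos)
    {
        return std::make_pair(arg, arg);
    }
    return std::make_pair(arg.substr(0, pos), arg.substr(pos + 1));
}
```

Under these assumptions, `tree/sub:tree-sub` yields the sub-path `tree/sub` with alias `tree-sub`, while a bare `tree` yields `tree` for both.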
@@ -95,6 +100,33 @@ Because Perforce integration isn't a 1-to-1 mapping onto Git merge, there can be
 
 If the Perforce tree contains sub-branches, such as `//base/tree/sub` being a sub-branch of `//base/tree`, then you can use the arguments `--path //base/... --branch tree/sub:tree-sub --branch tree`. The ordering is important here - provide the deeper paths first to have them take priority over the others. Because Git creates branches with '/' characters as implicit directories, you must provide the Git branch alias to prevent Git reporting an error where the branch "tree" can't be created because it is already a directory, or "tree/sub" can't be created because "tree" isn't a directory.
 
+## Notes on stream mapping mode
+
+Stream mapping mode is disabled by default. When enabled, Perforce [stream views and paths](https://www.perforce.com/manuals/p4guide/Content/P4Guide/streams.paths.html) are treated as part of the depot in which the stream view is mapped. This can include completely out-of-tree depots with no shared path deeper than `//...`. Everything is committed to the master branch (using it with branching mode is untested).
+Let's assume that you have some depots:
+```
+//A/foo/...
+//B/bar/...
+//C/baz/...
+```
+and you have a monorepo stream (read-only in Perforce) with the following mappings:
+```
+import a/... //A/foo/...
+import b/... //B/bar/...
+import c/... //C/baz/...
+```
+
+Stream mapping mode also recursively traverses those depots/streams and maps in their contents (mapping in just a single file works too).
+
+If you were to run p4-fusion without stream mapping mode, the result could be an empty repo. With stream mapping mode, paths are rewritten to merge all of those imports and treat them as one Git repo (almost) seamlessly.
+
+A side effect of stream mapping mode is that you can use stream mappings to exclude paths that no longer exist, and as a result their contents are left out of the Git history.
+
+### Limitations
+
+- If there is a cycle in the mapping graph, this mode will break.
+- If the contents of a subdirectory were at one stage part of the parent stream and are now being mapped in (or vice versa), there is a possibility that the files may not exist in the Git history. (This applies on a file-by-file basis: if you have several paths mapped into one directory and have files in the parent stream in the same directory, it will be fine provided the file names are all unique to that stream.)
+
 ## Checking Results
 
 In order to test the validity of the logic, we need to run the program over a Perforce depot and compare each changelist against the corresponding Git commit SHA, to ensure the files match up.
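The first limitation listed in the stream mapping notes above (a cycle in the mapping graph) could in principle be detected before conversion starts. The following is only a sketch of such a pre-check using a depth-first search; nothing in this commit indicates that p4-fusion performs it. `ImportGraph`, `reachesCycle` and `hasImportCycle` are hypothetical names, and the graph is assumed to have already been built from the stream specs.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model: each stream name maps to the streams it imports.
typedef std::map<std::string, std::vector<std::string> > ImportGraph;

enum VisitState { Unvisited = 0, InProgress, Done };

// Depth-first search from `node`; returns true if a stream that is still
// being explored is reached again, which means the imports form a cycle.
static bool reachesCycle(const ImportGraph& graph, const std::string& node,
    std::map<std::string, VisitState>& state)
{
    state[node] = InProgress;
    ImportGraph::const_iterator it = graph.find(node);
    if (it != graph.end())
    {
        for (const std::string& next : it->second)
        {
            // operator[] value-initializes missing entries to Unvisited (0).
            const VisitState s = state[next];
            if (s == InProgress)
            {
                return true;
            }
            if (s == Unvisited && reachesCycle(graph, next, state))
            {
                return true;
            }
        }
    }
    state[node] = Done;
    return false;
}

// Returns true if any chain of imports leads back to its starting stream.
bool hasImportCycle(const ImportGraph& graph)
{
    std::map<std::string, VisitState> state;
    for (ImportGraph::const_iterator it = graph.begin(); it != graph.end(); ++it)
    {
        if (state[it->first] == Unvisited && reachesCycle(graph, it->first, state))
        {
            return true;
        }
    }
    return false;
}
```

A caller could run `hasImportCycle` once after collecting the mappings and stop with an error message instead of recursing indefinitely.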
@@ -106,12 +138,13 @@ Because of the extra effort the script performs, expect it to take orders of mag
 ## Build
 
 0. Pre-requisites
-    * Install openssl@1.0.2t at `/usr/local/ssl` by following the steps [here](https://askubuntu.com/a/1094690).
+    * Install OpenSSL (both 1.1.1 and 3 work).
     * Install CMake 3.16+.
     * Install g++ 11.2.0 (older versions compatible with C++11 are also supported).
     * Clone this repository or [get a release distribution](https://github.com/salesforce/p4-fusion/releases).
     * Get the Helix Core C++ API binaries from the [official Perforce website](https://www.perforce.com/downloads/helix-core-c/c-api).
-        * Tested versions: 2021.1, 2021.2, 2022.1
+        * If you are using OpenSSL 3: as of October 12th, 2024, you need to manually access the FTP site [here](https://cdist2.perforce.com/perforce/r24.1/bin.linux26x86_64/).
+        * Tested versions: 2021.1, 2021.2, 2022.1, 2024.1
         * We recommend always picking the newest API versions that compile with p4-fusion.
     * Extract the contents in `./vendor/helix-core-api/linux/` or `./vendor/helix-core-api/mac/` based on your OS.
 
p4-fusion/branch_set.cc

Lines changed: 56 additions & 3 deletions
@@ -106,8 +106,10 @@ std::vector<Branch> createBranchesFromPaths(const std::vector<std::string>& bran
     return parsed;
 }
 
-BranchSet::BranchSet(std::vector<std::string>& clientViewMapping, const std::string& baseDepotPath, const std::vector<std::string>& branches, const bool includeBinaries)
+BranchSet::BranchSet(std::vector<std::string>& clientViewMapping, const std::string& baseDepotPath, const std::vector<std::string>& branches, const std::vector<StreamResult::MappingData>& mappings, const std::vector<StreamResult::MappingData>& exclusions, const bool includeBinaries)
     : m_branches(createBranchesFromPaths(branches))
+    , m_mappings(mappings)
+    , m_exclusions(exclusions)
     , m_includeBinaries(includeBinaries)
 {
     m_view.InsertTranslationMapping(clientViewMapping);
@@ -219,11 +221,62 @@ std::unique_ptr<ChangedFileGroups> BranchSet::ParseAffectedFiles(const std::vect
         {
             continue;
         }
+        // Put logic for Absolutely do not dare map these files here, here :)
         std::string relativeDepotPath = stripBasePath(depotFile);
         if (relativeDepotPath.empty())
         {
-            // Not under the depot path. Shouldn't happen due to the way we
-            // scan for files, but...
+            // Not under regular depot path, might be mapped in
+            // check and attenuate if it is.
+            bool isImport = false;
+            for (auto const& v : m_mappings)
+            {
+                if (STDHelpers::EndsWith(v.stream2, "..."))
+                {
+                    // get rid of trailing ...
+                    auto tempStr = v.stream2.substr(0, v.stream2.size() - 3);
+                    if (STDHelpers::StartsWith(depotFile, tempStr))
+                    {
+                        isImport = true;
+                        // Shove the replacement path at the front
+                        relativeDepotPath = v.stream1.substr(0, v.stream1.size() - 3) + depotFile.substr(tempStr.size());
+                        break;
+                    }
+                    if (depotFile == tempStr)
+                    {
+                        // Map in exactly one file and nothing else.
+                        relativeDepotPath = v.stream1;
+                        isImport = true;
+                        break;
+                    }
+                }
+            }
+
+            if (!isImport)
+            {
+                continue;
+            }
+        }
+
+        // Check the file or path is not marked as excluded (There is a chance that not all of a mapped in directory is desired, so we have to check post mapping.)
+        bool discard = false;
+        for (auto const& v : m_exclusions)
+        {
+            if (STDHelpers::EndsWith(v.stream1, "..."))
+            {
+                if (STDHelpers::StartsWith(relativeDepotPath, v.stream1.substr(0, v.stream1.size() - 3)))
+                {
+                    discard = true;
+                    break;
+                }
+            }
+            else if (v.stream1 == relativeDepotPath)
+            {
+                discard = true;
+                break;
+            }
+        }
+        if (discard)
+        {
             continue;
         }
 
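The import and exclusion checks in the hunk above can be read in isolation as two small functions. The sketch below is a standalone approximation, not the code from this commit. `MappingSketch` is a stand-in for `StreamResult::MappingData` (whose definition is not shown here) and assumes only the `stream1` and `stream2` string fields used in the diff; `startsWith`, `endsWith` and `stripEllipsis` stand in for the `STDHelpers` functions. In the hunk, the exact-match test `depotFile == tempStr` sits after the prefix test inside the same `...` branch, so it appears unreachable; the sketch instead handles a single-file mapping in an `else` branch, mirroring the exclusion loop. That rearrangement is an assumption about the intent.

```cpp
#include <string>
#include <vector>

// Stand-in for StreamResult::MappingData: stream1 is the stream-side path
// (e.g. "a/..."), stream2 the depot-side path (e.g. "//A/foo/...").
struct MappingSketch
{
    std::string stream1;
    std::string stream2;
};

static bool startsWith(const std::string& s, const std::string& prefix)
{
    return s.compare(0, prefix.size(), prefix) == 0;
}

static bool endsWith(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size()
        && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Drop a trailing "..." wildcard if present; otherwise return the path as-is.
static std::string stripEllipsis(const std::string& path)
{
    return endsWith(path, "...") ? path.substr(0, path.size() - 3) : path;
}

// Returns the stream-relative path for an imported depot file,
// or "" if no mapping applies.
std::string mapImportedPath(const std::vector<MappingSketch>& mappings, const std::string& depotFile)
{
    for (const MappingSketch& m : mappings)
    {
        if (endsWith(m.stream2, "..."))
        {
            // Wildcard mapping: swap the depot prefix for the stream prefix.
            const std::string depotPrefix = stripEllipsis(m.stream2);
            if (startsWith(depotFile, depotPrefix))
            {
                return stripEllipsis(m.stream1) + depotFile.substr(depotPrefix.size());
            }
        }
        else if (depotFile == m.stream2)
        {
            // Single-file mapping: the stream side names the file directly.
            return m.stream1;
        }
    }
    return "";
}

// Returns true if the already-mapped relative path matches an exclusion.
bool isExcluded(const std::vector<MappingSketch>& exclusions, const std::string& relativePath)
{
    for (const MappingSketch& e : exclusions)
    {
        if (endsWith(e.stream1, "..."))
        {
            if (startsWith(relativePath, stripEllipsis(e.stream1)))
            {
                return true;
            }
        }
        else if (e.stream1 == relativePath)
        {
            return true;
        }
    }
    return false;
}
```

With the README's example mapping `import a/... //A/foo/...`, a depot file `//A/foo/src/x.cc` would be rewritten to `a/src/x.cc` and then checked against the exclusions, matching the order used in the hunk (map first, exclude afterwards).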

p4-fusion/branch_set.h

Lines changed: 101 additions & 98 deletions
@@ -1,98 +1,101 @@
-/*
- * Copyright (c) 2022 Salesforce, Inc.
- * All rights reserved.
- * SPDX-License-Identifier: BSD-3-Clause
- * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause
- */
-#pragma once
-
-#include <string>
-#include <vector>
-#include <array>
-#include <memory>
-#include <stdexcept>
-
-#include "commands/file_map.h"
-#include "commands/file_data.h"
-#include "utils/std_helpers.h"
-
-struct BranchedFileGroup
-{
-    // If a BranchedFiles collection hasSource == true,
-    // then all files in this collection MUST be a merge
-    // from the given source branch to the target branch.
-    // These branch names will be the Git branch names.
-    std::string sourceBranch;
-    std::string targetBranch;
-    bool hasSource;
-    std::vector<FileData> files;
-
-    // Get all the relative file names from each of the file data.
-    std::vector<std::string> GetRelativeFileNames();
-};
-
-struct ChangedFileGroups
-{
-private:
-    ChangedFileGroups();
-
-public:
-    std::vector<BranchedFileGroup> branchedFileGroups;
-    int totalFileCount;
-
-    // When all the file groups have finished being used,
-    // only then can we safely clear out the data.
-    void Clear();
-
-    ChangedFileGroups(std::vector<BranchedFileGroup>& groups, int totalFileCount);
-
-    static std::unique_ptr<ChangedFileGroups> Empty() { return std::unique_ptr<ChangedFileGroups>(new ChangedFileGroups); };
-};
-
-struct Branch
-{
-public:
-    const std::string depotBranchPath;
-    const std::string gitAlias;
-
-    Branch(const std::string& branch, const std::string& alias);
-
-    // splitBranchPath If the relativeDepotPath matches, returns {branch alias, branch file path}.
-    // Otherwise, returns {"", ""}
-    std::array<std::string, 2> SplitBranchPath(const std::string& relativeDepotPath) const;
-};
-
-// A singular view on the branches and a base view (acts as a filter to trim down affected files).
-// Maps a changed file state to a list of resulting branches and affected files.
-struct BranchSet
-{
-private:
-    // Technically, these should all be const.
-    const bool m_includeBinaries;
-    std::string m_basePath;
-    const std::vector<Branch> m_branches;
-    FileMap m_view;
-
-    // stripBasePath remove the base path from the depot path, or "" if not in the base path.
-    std::string stripBasePath(const std::string& depotPath) const;
-
-    // splitBranchPath extract the branch name and path under the branch (no leading '/' on the path)
-    // relativeDepotPath - already stripped from running stripBasePath.
-    std::array<std::string, 2> splitBranchPath(const std::string& relativeDepotPath) const;
-
-public:
-    BranchSet(std::vector<std::string>& clientViewMapping, const std::string& baseDepotPath, const std::vector<std::string>& branches, const bool includeBinaries);
-
-    // HasMergeableBranch is there a branch model that requires integration history?
-    bool HasMergeableBranch() const { return !m_branches.empty(); };
-
-    int Count() const { return m_branches.size(); };
-
-    // ParseAffectedFiles create collections of merges and commits.
-    // Breaks up the files into those that are within the view, with each item in the
-    // list is its own target Git branch.
-    // This also has the side-effect of populating the relative path value in the file data.
-    // ... the FileData object is copied, but it's underlying shared data is shared. So, this
-    // breaks the const.
-    std::unique_ptr<ChangedFileGroups> ParseAffectedFiles(const std::vector<FileData>& cl) const;
-};
+/*
+ * Copyright (c) 2022 Salesforce, Inc.
+ * All rights reserved.
+ * SPDX-License-Identifier: BSD-3-Clause
+ * For full license text, see the LICENSE.txt file in the repo root or https://opensource.org/licenses/BSD-3-Clause
+ */
+#pragma once
+
+#include <string>
+#include <vector>
+#include <array>
+#include <memory>
+#include <stdexcept>
+
+#include "commands/file_map.h"
+#include "commands/file_data.h"
+#include "commands/stream_result.h"
+#include "utils/std_helpers.h"
+
+struct BranchedFileGroup
+{
+    // If a BranchedFiles collection hasSource == true,
+    // then all files in this collection MUST be a merge
+    // from the given source branch to the target branch.
+    // These branch names will be the Git branch names.
+    std::string sourceBranch;
+    std::string targetBranch;
+    bool hasSource;
+    std::vector<FileData> files;
+
+    // Get all the relative file names from each of the file data.
+    std::vector<std::string> GetRelativeFileNames();
+};
+
+struct ChangedFileGroups
+{
+private:
+    ChangedFileGroups();
+
+public:
+    std::vector<BranchedFileGroup> branchedFileGroups;
+    int totalFileCount;
+
+    // When all the file groups have finished being used,
+    // only then can we safely clear out the data.
+    void Clear();
+
+    ChangedFileGroups(std::vector<BranchedFileGroup>& groups, int totalFileCount);
+
+    static std::unique_ptr<ChangedFileGroups> Empty() { return std::unique_ptr<ChangedFileGroups>(new ChangedFileGroups); };
+};
+
+struct Branch
+{
+public:
+    const std::string depotBranchPath;
+    const std::string gitAlias;
+
+    Branch(const std::string& branch, const std::string& alias);
+
+    // splitBranchPath If the relativeDepotPath matches, returns {branch alias, branch file path}.
+    // Otherwise, returns {"", ""}
+    std::array<std::string, 2> SplitBranchPath(const std::string& relativeDepotPath) const;
+};
+
+// A singular view on the branches and a base view (acts as a filter to trim down affected files).
+// Maps a changed file state to a list of resulting branches and affected files.
+struct BranchSet
+{
+private:
+    // Technically, these should all be const.
+    const bool m_includeBinaries;
+    std::string m_basePath;
+    const std::vector<Branch> m_branches;
+    const std::vector<StreamResult::MappingData> m_mappings;
+    const std::vector<StreamResult::MappingData> m_exclusions;
+    FileMap m_view;
+
+    // stripBasePath remove the base path from the depot path, or "" if not in the base path.
+    std::string stripBasePath(const std::string& depotPath) const;
+
+    // splitBranchPath extract the branch name and path under the branch (no leading '/' on the path)
+    // relativeDepotPath - already stripped from running stripBasePath.
+    std::array<std::string, 2> splitBranchPath(const std::string& relativeDepotPath) const;
+
+public:
+    BranchSet(std::vector<std::string>& clientViewMapping, const std::string& baseDepotPath, const std::vector<std::string>& branches, const std::vector<StreamResult::MappingData>& mappings, const std::vector<StreamResult::MappingData>& exclusions, const bool includeBinaries);
+
+    // HasMergeableBranch is there a branch model that requires integration history?
+    bool HasMergeableBranch() const { return !m_branches.empty(); };
+
+    int Count() const { return m_branches.size(); };
+
+    // ParseAffectedFiles create collections of merges and commits.
+    // Breaks up the files into those that are within the view, with each item in the
+    // list is its own target Git branch.
+    // This also has the side-effect of populating the relative path value in the file data.
+    // ... the FileData object is copied, but it's underlying shared data is shared. So, this
+    // breaks the const.
+    std::unique_ptr<ChangedFileGroups> ParseAffectedFiles(const std::vector<FileData>& cl) const;
+};
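The comment on `Branch::SplitBranchPath` in the header above states its contract: return `{branch alias, branch file path}` on a match and `{"", ""}` otherwise. A minimal sketch of that contract follows, together with the "deeper paths first" ordering that the README asks for. `BranchSketch` and `splitAcrossBranches` are hypothetical stand-ins; the real implementation lives in `branch_set.cc` and is not shown in this diff, so the matching rule used here (branch path followed by `/`) is an assumption.

```cpp
#include <array>
#include <string>
#include <vector>

// Stand-in for Branch, keeping only the two fields declared in the header.
struct BranchSketch
{
    std::string depotBranchPath; // e.g. "tree/sub"
    std::string gitAlias;        // e.g. "tree-sub"

    // Returns {alias, path under the branch} when relativeDepotPath lies
    // under this branch, otherwise {"", ""}.
    // Assumption: "under" means "starts with depotBranchPath + '/'".
    std::array<std::string, 2> SplitBranchPath(const std::string& relativeDepotPath) const
    {
        const std::string prefix = depotBranchPath + "/";
        if (relativeDepotPath.compare(0, prefix.size(), prefix) == 0)
        {
            return { { gitAlias, relativeDepotPath.substr(prefix.size()) } };
        }
        return { { "", "" } };
    }
};

// Tries the branches in the order given; the first match wins, which is why
// deeper paths have to be listed before their parents.
std::array<std::string, 2> splitAcrossBranches(const std::vector<BranchSketch>& branches, const std::string& relativeDepotPath)
{
    for (const BranchSketch& b : branches)
    {
        const std::array<std::string, 2> result = b.SplitBranchPath(relativeDepotPath);
        if (!result[0].empty())
        {
            return result;
        }
    }
    return { { "", "" } };
}
```

With branches listed as `tree/sub` (alias `tree-sub`) and then `tree`, the file `tree/sub/a.txt` resolves to the `tree-sub` branch; listing `tree` first would capture it under `tree` with the path `sub/a.txt` instead.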
