Possible bugs when generating inputs for dataflow pruning

Currently the generated inputs for dataflow pruning looks something like this:
```
        Var 0 : -1
        Var 0 : 1
        Var 0 : 0
        Var 0 : 2147483647
        Var 0 : -2147483648
        Var 0 : 0
        Var 0 : 1
        Var 0 : -1
        Var 0 : 2147483647
        Var 0 : -2147483648
        Var 0 : 520932930
        Var 0 : 28925691
        Var 0 : 822784415
        Var 0 : 890459872
        Var 0 : 145532761
        Var 0 : 1
        Var 0 : 26
        Var 0 : 1
        Var 0 : 6
        Var 0 : 1
```
Inputs 0,1,-1, 2147483647, -2147483647 are "special inputs" and the remaining values are supposed to be random. Since by default only 10 inputs are tested for each candidate, random inputs are never used. The special inputs are generated twice because there are currently two implementations. 

In Pruning.cpp line 761:
```
  constexpr unsigned PermutedLimit = 15;
  std::string specialInputs = "abcde";
  std::unordered_set<std::string> Visited;
  do {
    int i = 0;
    std::string usedInput;
    for (auto &&I : Inputs) {
      if (I->K == souper::Inst::Var) {
        usedInput.append(1, specialInputs[i]);
        Cache[I] = getSpecialAPInt(specialInputs[i++], I->Width);
      }
    }

    if (!Visited.count(usedInput) && isInputValid(Cache)) {
      if (InputSets.size() >= PermutedLimit) break;
      InputSets.push_back(Cache);
      Visited.insert(usedInput);
    }
  } while (std::next_permutation(specialInputs.begin(), specialInputs.end()));
```
And right below it:
```
  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt(I->Width, 0)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt(I->Width, 1)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt(I->Width, -1)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt::getSignedMaxValue(I->Width)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt::getSignedMinValue(I->Width)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);
```
Commenting out either of these only adds the special inputs once which avoids testing the same input multiple times and makes room for random inputs.
Random Inputs are generated In Pruning.cpp line 817:
```
  constexpr int MaxTries = 100;
  constexpr int NumLargeInputs = 5;
  std::srand(0);
  int i, m;
  for (i = 0, m = 0; i < NumLargeInputs && m < MaxTries; ++m ) {
    for (auto &&I : Inputs) {
      if (I->K == souper::Inst::Var)
        Cache[I] = {llvm::APInt(I->Width, std::rand() % llvm::APInt(I->Width, -1).getLimitedValue())};
    }
    if (isInputValid(Cache)) {
      i++;
      InputSets.push_back(Cache);
    }
  }

  if (StatsLevel > 2 && i < NumLargeInputs) {
    llvm::errs() << "MaxTries (100) exhausted searching for large inputs.\n";
  }

  constexpr int NumSmallInputs = 5;
  for (i = 0, m = 0; i < NumSmallInputs && m < MaxTries; ++m ) {
    for (auto &&I : Inputs) {
      if (I->K == souper::Inst::Var)
        Cache[I] = {llvm::APInt(I->Width, std::rand() % I->Width)};
    }
    if (isInputValid(Cache)) {
      i++;
      InputSets.push_back(Cache);
    }
  }

  if (StatsLevel > 2 && i < NumSmallInputs) {
    llvm::errs() << "MaxTries (100) exhausted searching for small inputs.\n";
  }
``` 
The random inputs are never reseeded, but changing `std::srand(0);` to `std::srand(time(0));` makes random numbers work as intended. Also, when generating small inputs they have a range of 0 to I->Width. This leads to a lot of repeat values since MaxTries is 100 and you could only be generating values from 0 to 8 for example. Maybe replacing `Cache[I] = {llvm::APInt(I->Width, std::rand() % I->Width)};` with something like `Cache[I] = {llvm::APInt(I->Width, std::rand() % (some value depending on width))};` would lead to less repeat values?

I have fixes for these issues and can make a pull request. I just wanted to double check first since I didn't know which implementation of adding special inputs to remove since they both work. I also made the max amount of inputs to try user controllable which is part of a TODO on Pruning.cpp line 246. No changes were made to the range of random small inputs but I fixed the srand.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Possible bugs when generating inputs for dataflow pruning #870

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Possible bugs when generating inputs for dataflow pruning #870

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions