Skip to content

Possible bugs when generating inputs for dataflow pruning #870

Open
@liamsemeria

Description

Currently the generated inputs for dataflow pruning looks something like this:

        Var 0 : -1
        Var 0 : 1
        Var 0 : 0
        Var 0 : 2147483647
        Var 0 : -2147483648
        Var 0 : 0
        Var 0 : 1
        Var 0 : -1
        Var 0 : 2147483647
        Var 0 : -2147483648
        Var 0 : 520932930
        Var 0 : 28925691
        Var 0 : 822784415
        Var 0 : 890459872
        Var 0 : 145532761
        Var 0 : 1
        Var 0 : 26
        Var 0 : 1
        Var 0 : 6
        Var 0 : 1

Inputs 0,1,-1, 2147483647, -2147483647 are "special inputs" and the remaining values are supposed to be random. Since by default only 10 inputs are tested for each candidate, random inputs are never used. The special inputs are generated twice because there are currently two implementations.

In Pruning.cpp line 761:

  constexpr unsigned PermutedLimit = 15;
  std::string specialInputs = "abcde";
  std::unordered_set<std::string> Visited;
  do {
    int i = 0;
    std::string usedInput;
    for (auto &&I : Inputs) {
      if (I->K == souper::Inst::Var) {
        usedInput.append(1, specialInputs[i]);
        Cache[I] = getSpecialAPInt(specialInputs[i++], I->Width);
      }
    }

    if (!Visited.count(usedInput) && isInputValid(Cache)) {
      if (InputSets.size() >= PermutedLimit) break;
      InputSets.push_back(Cache);
      Visited.insert(usedInput);
    }
  } while (std::next_permutation(specialInputs.begin(), specialInputs.end()));

And right below it:

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt(I->Width, 0)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt(I->Width, 1)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt(I->Width, -1)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt::getSignedMaxValue(I->Width)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

  for (auto &&I : Inputs) {
    if (I->K == souper::Inst::Var)
      Cache[I] = {llvm::APInt::getSignedMinValue(I->Width)};
  }
  if (isInputValid(Cache))
    InputSets.push_back(Cache);

Commenting out either of these only adds the special inputs once which avoids testing the same input multiple times and makes room for random inputs.
Random Inputs are generated In Pruning.cpp line 817:

  constexpr int MaxTries = 100;
  constexpr int NumLargeInputs = 5;
  std::srand(0);
  int i, m;
  for (i = 0, m = 0; i < NumLargeInputs && m < MaxTries; ++m ) {
    for (auto &&I : Inputs) {
      if (I->K == souper::Inst::Var)
        Cache[I] = {llvm::APInt(I->Width, std::rand() % llvm::APInt(I->Width, -1).getLimitedValue())};
    }
    if (isInputValid(Cache)) {
      i++;
      InputSets.push_back(Cache);
    }
  }

  if (StatsLevel > 2 && i < NumLargeInputs) {
    llvm::errs() << "MaxTries (100) exhausted searching for large inputs.\n";
  }

  constexpr int NumSmallInputs = 5;
  for (i = 0, m = 0; i < NumSmallInputs && m < MaxTries; ++m ) {
    for (auto &&I : Inputs) {
      if (I->K == souper::Inst::Var)
        Cache[I] = {llvm::APInt(I->Width, std::rand() % I->Width)};
    }
    if (isInputValid(Cache)) {
      i++;
      InputSets.push_back(Cache);
    }
  }

  if (StatsLevel > 2 && i < NumSmallInputs) {
    llvm::errs() << "MaxTries (100) exhausted searching for small inputs.\n";
  }

The random inputs are never reseeded, but changing std::srand(0); to std::srand(time(0)); makes random numbers work as intended. Also, when generating small inputs they have a range of 0 to I->Width. This leads to a lot of repeat values since MaxTries is 100 and you could only be generating values from 0 to 8 for example. Maybe replacing Cache[I] = {llvm::APInt(I->Width, std::rand() % I->Width)}; with something like Cache[I] = {llvm::APInt(I->Width, std::rand() % (some value depending on width))}; would lead to less repeat values?

I have fixes for these issues and can make a pull request. I just wanted to double check first since I didn't know which implementation of adding special inputs to remove since they both work. I also made the max amount of inputs to try user controllable which is part of a TODO on Pruning.cpp line 246. No changes were made to the range of random small inputs but I fixed the srand.

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions