Meeting Notes

November 18th

  • Android part at the beginning
  • Compare with the state of the art, emphasising lower computation
  • Temporal fusion with 3D convolution (see the sketch after this list)
  • Explain why we use only a few frames for temporal fusion
  • Keep the block diagram small, or omit it if it is standard
  • Annotate the dataset diagram with the relevant numbers
  • Small charts?
  • Dataset in Google Colab
  • Keep the statements in the slides small and concise
  • Frame-skipping experiments
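
A minimal sketch of what temporal fusion with a 3D convolution could look like over stacked per-frame latent features; the module name, channel count, and window size are illustrative assumptions, not the project's actual code.

```python
# Hedged sketch: fuse a short window of per-frame latent feature maps with a
# 3D convolution, assuming features stacked as (B, C, T, H, W).
import torch
import torch.nn as nn

class Conv3dTemporalFusion(nn.Module):
    """Fuses T temporal feature maps into a single feature map."""
    def __init__(self, channels: int, window: int = 4):
        super().__init__()
        # Kernel spans the full temporal window, 3x3 spatially.
        self.fuse = nn.Conv3d(channels, channels,
                              kernel_size=(window, 3, 3),
                              padding=(0, 1, 1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, T, H, W) -> (B, C, 1, H, W) -> (B, C, H, W)
        return self.fuse(feats).squeeze(2)

# Usage: stack encoder features of 4 consecutive frames along dim=2.
fusion = Conv3dTemporalFusion(channels=64, window=4)
x = torch.randn(2, 64, 4, 128, 160)
print(fusion(x).shape)  # torch.Size([2, 64, 128, 160])
```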

November 9th

  • Frame skipping
  • For the trained model, change the batch size and compare the results (see the sketch after this list)
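
A rough sketch of how the frame-skipping and batch-size experiments on the already-trained model might be run; `model`, the dataset, and the pixel-accuracy metric are placeholders for the project's own objects.

```python
# Hedged sketch: evaluate a trained model while keeping only every k-th frame
# of a sequence and while varying the batch size.
import torch
from torch.utils.data import DataLoader, Subset

def skipped_indices(n_frames, skip):
    # skip=2 keeps frames 0, 2, 4, ...
    return list(range(0, n_frames, skip))

def evaluate_with_skip(model, dataset, skip, batch_size, device="cuda"):
    loader = DataLoader(Subset(dataset, skipped_indices(len(dataset), skip)),
                        batch_size=batch_size, shuffle=False)
    model.eval().to(device)
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds.cpu() == labels).sum().item()
            total += labels.numel()
    return correct / total  # pixel accuracy

# Usage idea: compare skip factors and batch sizes on the trained model.
# for skip in (1, 2, 4):
#     for bs in (1, 4, 8):
#         print(skip, bs, evaluate_with_skip(model, val_set, skip, bs))
```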

November 4th

  • Experiment with different batch sizes and their impact on the output, only for the fog and morning variations
  • List the memory footprint (see the sketch after this list).
  • Take eight consecutive sequence images and compare them across the Vanilla, GP, and LSTM models.
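
A hedged sketch for listing the memory footprint per batch size using PyTorch's peak-memory counters; `model` and the input resolution are assumed placeholders.

```python
# Hedged sketch: record the peak GPU memory of one forward pass for several
# batch sizes.
import torch

def peak_memory_mb(model, batch_size, shape=(3, 256, 320), device="cuda"):
    model.eval().to(device)
    torch.cuda.reset_peak_memory_stats(device)
    with torch.no_grad():
        model(torch.randn(batch_size, *shape, device=device))
    return torch.cuda.max_memory_allocated(device) / 1024 ** 2

# for bs in (1, 2, 4, 8):
#     print(f"batch size {bs}: {peak_memory_mb(model, bs):.1f} MB")
```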

October 12th

  • Write the dataset variations.
  • Dice coefficient (see the sketch below)
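
A small sketch of the Dice coefficient for label-encoded segmentation masks, as a starting point; the class count and smoothing term are assumptions.

```python
# Hedged sketch: per-class Dice coefficient for masks given as integer class
# labels; the eps term avoids division by zero for absent classes.
import torch

def dice_coefficient(pred, target, num_classes, eps=1e-6):
    scores = []
    for c in range(num_classes):
        p = (pred == c).float()
        t = (target == c).float()
        intersection = (p * t).sum()
        scores.append((2 * intersection + eps) / (p.sum() + t.sum() + eps))
    return torch.stack(scores)  # one score per class; .mean() gives macro Dice

# pred, target: (H, W) or (B, H, W) tensors of class indices.
```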

September 20th

  • Transformers.
  • W-Net.
  • U-Net.
  • The VKITTI results should be done.
  • Change the learning rate and loss function to improve the results on the ScanNet dataset.
  • Change the U-Net base model to improve performance.

September 6th

  • Train on the furniture class for 10 epochs with the Vanilla, GP, and LSTM models.
  • Learning rate scheduler (see the sketch after this list)
  • TensorBoard
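
A hedged sketch combining a step learning-rate scheduler with TensorBoard logging for the 10-epoch runs; `model`, `train_loader`, and `criterion` are placeholders for the project's own objects, and StepLR is just one possible scheduler.

```python
# Hedged sketch: 10-epoch training loop with an LR scheduler and TensorBoard.
import torch
from torch.utils.tensorboard import SummaryWriter

def train_10_epochs(model, train_loader, criterion, log_dir="runs/furniture_vanilla"):
    # model, train_loader, criterion come from the project (placeholders here).
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.1)
    writer = SummaryWriter(log_dir=log_dir)
    model.train().cuda()
    for epoch in range(10):
        for images, masks in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images.cuda()), masks.cuda())
            loss.backward()
            optimizer.step()
        scheduler.step()                                   # decay the LR every 4 epochs
        writer.add_scalar("train/loss", loss.item(), epoch)
        writer.add_scalar("train/lr", scheduler.get_last_lr()[0], epoch)
    writer.close()
```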

August 23rd

  • No meeting this week.
  • Updated chapter 2 of the report
  • Start writing the results in the experimental section
  • Focus on the metrics and explain the results
  • Plot the covariance plot

August 9th

August 2nd

  • Run weighted cross-entropy, focal loss, and plain cross-entropy for 2 hours each and check which works best
  • AdamW, PyTorch learning rate finder (see the sketch after this list)
  • https://github.com/davidtvs/pytorch-lr-finder
  • Leave for the end: learning rate scheduler; try two schedulers
  • 186 sequences in total; 149 remain for validation and testing.
  • How to compare single-frame segmentation (Vanilla) vs. multi-frame segmentation (latent frame fusion, GP)?
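
A hedged sketch of the linked pytorch-lr-finder used together with AdamW and a weighted cross-entropy loss; `model`, `train_loader`, and `class_weights` are placeholders, and the LR range and iteration count are illustrative.

```python
# Hedged sketch: learning-rate range test with the linked torch-lr-finder package.
import torch
import torch.nn as nn
from torch_lr_finder import LRFinder  # pip install torch-lr-finder

def find_lr(model, train_loader, class_weights=None):
    # model, train_loader, class_weights are the project's objects (placeholders).
    criterion = nn.CrossEntropyLoss(weight=class_weights)           # weighted CE variant
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-7, weight_decay=1e-2)
    lr_finder = LRFinder(model, optimizer, criterion, device="cuda")
    lr_finder.range_test(train_loader, end_lr=10, num_iter=100)
    lr_finder.plot()   # inspect the loss-vs-LR curve; pick a value near the steepest descent
    lr_finder.reset()  # restore the model and optimizer to their initial state
```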

July 26th

  • Reduce the number of sequences
  • Are the sequences related to each other?
  • Augmentation to improve accuracy
  • Without augmentation (for comparison)
  • Reduce the time per epoch

July 5th

  • Look at the paper on Infinite-Horizon Gaussian Processes
  • Gaussian process regression (see the sketch after this list)
  • Link between the Gaussian process and linear regression
  • Linear regression with a Gaussian process
  • Evaluation chapter: each section should state its research question, the results for it, and the conclusion for that research question.
  • Evaluation is the most important part
  • Compare the results of U-Net with temporal fusion against W-Net
  • In the evaluation, drop the check of results on the third sequence for the vanilla model
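
A minimal NumPy sketch of Gaussian process regression with an RBF kernel, to make the link to (Bayesian) linear regression concrete; the kernel hyperparameters and toy data are illustrative, not part of the project.

```python
# Hedged sketch: GP posterior mean and covariance on 1-D toy data.
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf(x_train, x_test)
    K_ss = rbf(x_test, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

x = np.linspace(0, 5, 20)
y = np.sin(x) + 0.05 * np.random.randn(20)
mu, cov = gp_posterior(x, y, np.linspace(0, 5, 100))
```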

June 28th

  • Check the validation of temporal fusion
  • Train on one sequence and test on another sequence
  • How many sequences are in the ScanNet dataset?
  • Dataloader that works on multiple sequences (see the sketch after this list)
  • Create a dataset sequence loader
  • Sequence loader
  • Multi-sequence sets of 2, 3, 4, and 5; analyse the impact of the GPlayer
  • IoU
  • Accuracy
  • Precision
  • ROC curve
  • AUC
  • Replica dataset, Habitat
  • Common framework for the dataset loader
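
A possible shape for the multi-sequence window loader (sets of 2 to 5 consecutive frames) mentioned above; `load_image` and `load_label` are hypothetical helpers for the project's own data format.

```python
# Hedged sketch: a dataset that yields windows of consecutive frames from
# several ScanNet-style sequences, so the GPlayer impact can be analysed per
# window size.
import torch
from torch.utils.data import Dataset

class MultiSequenceWindows(Dataset):
    def __init__(self, sequences, window=3):
        # sequences: list of lists of (image_path, label_path) tuples
        self.window = window
        self.sequences = sequences
        self.index = [(s, i) for s, seq in enumerate(sequences)
                      for i in range(len(seq) - window + 1)]

    def __len__(self):
        return len(self.index)

    def __getitem__(self, idx):
        s, start = self.index[idx]
        frames = self.sequences[s][start:start + self.window]
        # load_image / load_label are hypothetical helpers for the data format.
        images = torch.stack([load_image(p) for p, _ in frames])   # (T, C, H, W)
        labels = torch.stack([load_label(l) for _, l in frames])   # (T, H, W)
        return images, labels
```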

June 7th

  • Create a baseline with U-Net and train it on the Cityscapes dataset (see the sketch after this list)
  • Add the Gaussian process in the latent space
  • Mail Prof. Nico and Prof. Houben with the research question and proposal
  • Research adding a cost volume to the input, taking the idea from MVS temporal nonparametric fusion
  • Single and hybrid temporal fusion.
  • Create a set of segmentation datasets along with the camera poses.
  • Evaluate by taking two or three consecutive images and checking how they change.
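
A hedged sketch of loading Cityscapes semantic masks with torchvision for the U-Net baseline; the root path, resolution, and batch size are placeholders, and the dataset files must already be downloaded locally.

```python
# Hedged sketch: Cityscapes semantic-segmentation data loading for the baseline.
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

image_tf = transforms.Compose([transforms.Resize((256, 512)),
                               transforms.ToTensor()])
# Nearest-neighbour resize so class labels are not interpolated.
target_tf = transforms.Compose([
    transforms.Resize((256, 512), interpolation=transforms.InterpolationMode.NEAREST),
    transforms.PILToTensor()])

train_set = datasets.Cityscapes("data/cityscapes", split="train", mode="fine",
                                target_type="semantic",
                                transform=image_tf, target_transform=target_tf)
train_loader = DataLoader(train_set, batch_size=4, shuffle=True, num_workers=2)

images, masks = next(iter(train_loader))   # images: (4, 3, 256, 512)
```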

June 3rd

Feedback


  1. Right now your work has nothing to do with 3D reconstruction
  2. The application area is wrong
  3. Our goal is not 3D reconstruction
  4. Conclusion:
    • Change the introduction of the topic to focus on the two things you said you will do:
      • Semantic segmentation
      • Temporal fusion using camera poses
      • Efficiency
  5. Focus on what the meeting is for: it is not a dumping exercise ("let me dump whatever I know in this meeting"). The meeting was to sell the research question to the professor, to explain why the problem is important. Keep the slides on what you have done so far as backup and show them only if he asks.

Questions:

  1. Not interested in depth estimation
  2. Only interested in segmentation
  3. We had an implementation for Android; did you use the IMU?
  4. Sample the Gaussian process
  5. Interesting question: the impact of kernels
    • Do we need to retrain when we change the kernel?

May 31st

  • Mail to professor
  • Fine with the research questions
  • Downloaded the dataset
  • Temporal fusion + low-computation device

May 24th

Fine-tuning research questions

  • Question type: Characterization. Result: Qualitative and descriptive models
  • What are the works on state-of-the-art temporal nonparametric fusion?
  • How do the results from RQ1 compare with each other for performing temporal fusion?
  • Can we cross-transfer temporal nonparametric fusion to other tasks, such as object detection or segmentation?
  • Can we cross-transfer temporal nonparametric fusion to segmentation?
  • How does improving the resolution of the monocular depth map compare with state-of-the-art resolution-improving architectures?
  • How does incorporating a cost volume in the decoder of temporal fusion improve the resolution of the depth map?
  • What work exists in the temporal fusion domain?
  • Are the results of the temporal nonparametric fusion architecture reproducible?
  • What error metrics are available for monocular depth estimation?
  • Question type: Discrimination. Result: System
  • How do the results obtained from incorporating a new error metric in temporal nonparametric monocular depth estimation compare with previous work on monocular depth estimation?
  • Methods/Means: Analytical model
  • Can we incorporate adaptive thin volumes (ATVs) in the decoder of temporal nonparametric fusion to improve the resolution of the depth images? (This has been done for typical encoder-decoder architectures, but not with temporal nonparametric fusion.)
  • Compare the results obtained by incorporating adaptive thin volumes in the decoder with the resolution of images obtained from a standard depth estimation algorithm.
  • Can we incorporate temporal nonparametric fusion into SegNet to perform segmentation of the input images? (No work combines temporal nonparametric fusion with an encoder-decoder architecture for segmentation.)
  • Compare the results of this architecture with a standard segmentation architecture.
  • Can we perform image classification on temporal image data, with fusion of information in the latent space using the Gaussian process?

May 17th

  • Title: Efficient multi-view stereo depth estimation
  • Image related to depth estimation: two-image multi-view, camera pose
  • Raise an issue related to the deployment on Android
  • Run the data using the code from the DeMoN GitHub page
  • Create a Python pipeline that passes the same data through both architectures and compares the results (see the sketch after this list).
  • https://github.com/robustrobotics/multi_view_stereonet
  • Download the data from https://github.com/lmb-freiburg/demon
  • Look for papers that use a fusion of information in latent space.
  • Study of the Gaussian process
  • RQ2: functionality testing using different cost volumes and the Gaussian process; split it into two questions
  • RQ3: cross-application of the MVS approach
  • Remove robustness
  • Add a comparison table of the MVS and latest-paper results in the experiments; write a question and answer it.
  • Look for papers related to MVS using two images, fusion in latent space (hybrid fusion), and depth estimation
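
A rough sketch of the single-sample comparison pipeline between two architectures; `model_a`, `model_b`, the sample layout, and the abs-rel metric are assumptions about the project's interfaces, not its actual code.

```python
# Hedged sketch: pass the same (image pair, pose) sample through two depth
# networks and compare a simple per-pixel error.
import torch

def compare_models(model_a, model_b, sample, gt_depth, device="cuda"):
    left, right, pose = (t.to(device) for t in sample)
    gt = gt_depth.to(device)
    valid = gt > 0                         # ignore pixels without ground truth
    results = {}
    for name, model in (("baseline", model_a), ("mvs_fusion", model_b)):
        model.eval().to(device)
        with torch.no_grad():
            depth = model(left, right, pose)
        results[name] = ((depth - gt).abs() / gt)[valid].mean().item()
    return results   # {"baseline": ..., "mvs_fusion": ...}
```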

May 12th

Feedback

The presentation was only okay: not aligned, images at random locations. It is acceptable for a discussion, but make it a habit now so that you don't make these mistakes in the defense. The explanations were not crisp; good for a first time, but we need a crisp, well-rehearsed explanation of the methods and techniques, especially of the MVS paper.

  1. The example image is not good

    • We need a better example image
  2. The introduction is way too high-level.

    • Too much ground to cover before reaching the actual topic
  3. Animate the MVS architecture.

  4. The explanation of the mobile development needs a much better (pipeline) diagram, and it also needs to be animated while you talk.

Research questions:

  • RQ1: Robustness of MVS approaches
  • RQ2: Functionality testing using different cost volumes and the Gaussian process
  • RQ3: Cross-application of the MVS approach

May 10th

  • Add a block diagram to the deployment markdown file.
  • Forward snowballing (check which papers cite it); also check the papers it cites
  • Literature review: 1. General multi-view stereo (select stereo correspondence, cost). 2. Depth estimation (monocular, stereo, passive, active, mobile). 3. Android deployment.
  • Add noise to the data and check how the model performs (see the sketch after this list)
  • Change the cost volume and the error calculation.
  • What if we change the Gaussian process to a Student-t process?
  • Find at which parameters the model fails and propose the improvement (best case).
  • Give priority to building a baseline.
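
A small sketch for the noise experiment: perturb the inputs with Gaussian noise of increasing strength and re-evaluate; `model` and `val_loader` are placeholders, and pixel accuracy is just one possible metric.

```python
# Hedged sketch: add Gaussian pixel noise to the inputs and check how the
# model's predictions degrade.
import torch

def add_gaussian_noise(images, sigma):
    return (images + sigma * torch.randn_like(images)).clamp(0.0, 1.0)

@torch.no_grad()
def noisy_accuracy(model, val_loader, sigma, device="cuda"):
    model.eval().to(device)
    correct = total = 0
    for images, labels in val_loader:
        preds = model(add_gaussian_noise(images, sigma).to(device)).argmax(dim=1)
        correct += (preds.cpu() == labels).sum().item()
        total += labels.numel()
    return correct / total

# for sigma in (0.0, 0.05, 0.1, 0.2):
#     print(sigma, noisy_accuracy(model, val_loader, sigma))
```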

May 03rd

  • Discussed the different ways of deploying a PyTorch model on Android
  • Work on the milestones

April 26th

  • Make a markdown file on converting the Python code to Java.
  • Deploying Python code on Android (see the sketch after this list).
  • Reproduce the results from the paper.
  • Load the dataset and make it run.
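
A hedged sketch of one common route from Python to the Android (Java) side: export the trained model to TorchScript and a lite-interpreter file that the PyTorch Android API can load; `model` and the example input size are placeholders.

```python
# Hedged sketch: TorchScript export for PyTorch Mobile on Android.
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

model.eval()                               # `model` is the project's trained network
example = torch.randn(1, 3, 256, 320)      # assumed input size
traced = torch.jit.trace(model, example)
mobile_module = optimize_for_mobile(traced)
mobile_module._save_for_lite_interpreter("model.ptl")
# On Android, the Java side would load this file with
# org.pytorch.LiteModuleLoader.load("model.ptl").
```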

April 19th

  • Discussion on the GitHub folder structure
  • Discussed the extraction of images, quaternions, and translation matrices from the Android device
  • What is the baseline for the captured image?
  • We don't need to store the data for the real-time application
  • Can we run the deep learning model on an Android device?
  • How does the Gaussian process help in the depth estimation algorithm?
  • Target: reproduce the results from the pretrained model

April 12th

  • Discussion on the Problem Statement

March 08

  • Last Meeting Follow-up, Week 06, 10/02/2022

  • Overview of the paper.

  • Review of code elements.

  • Overview of the proposal.

  • Cost volume computation.

  • Able to run test.py.

  • Dataset problem.

  • encoder(left_image_cuda, right_image_cuda, KRKiUV_cuda_T, KT_cuda_T) computation (see the sketch at the end of this section).

  • Epipolar geometry (video lecture).

  • Today's meeting, Week 07, 17/02/2022

  • Study Gaussian processes

  • Learn the GPlayer code

  • Next week's meeting agenda, Week 06

  • Talk about the Multi-View Stereo by Temporal Nonparametric Fusion topic.
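
A hedged sketch of the plane-sweep idea behind the encoder(left, right, KRKiUV, KT) call: for each depth hypothesis, the right image is warped into the left view with pixel_right ~ K R K^-1 u + K t / d and a photometric cost is accumulated. Shapes and names here are illustrative, not the repository's exact implementation.

```python
# Hedged sketch: plane-sweep cost volume via homography warping.
import torch
import torch.nn.functional as F

def plane_sweep_cost_volume(left, right, KRKiUV, KT, depths):
    # left, right: (B, 3, H, W); KRKiUV: (B, 3, H*W) = K R K^-1 applied to
    # homogeneous pixel coordinates; KT: (B, 3, 1) = K t.
    B, _, H, W = left.shape
    costs = []
    for d in depths:
        warped = KRKiUV + KT / d                     # (B, 3, H*W), homogeneous
        u = warped[:, 0] / warped[:, 2]
        v = warped[:, 1] / warped[:, 2]
        # Normalise pixel coordinates to [-1, 1] for grid_sample.
        grid = torch.stack(((u / (W - 1)) * 2 - 1,
                            (v / (H - 1)) * 2 - 1), dim=-1).view(B, H, W, 2)
        right_warped = F.grid_sample(right, grid, align_corners=True)
        costs.append((left - right_warped).abs().mean(dim=1))   # (B, H, W)
    return torch.stack(costs, dim=1)                 # (B, num_depths, H, W)
```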

Feb 17th

  • Overview of the paper.
  • Cost volume construction.
  • Look at the rough view of the code.
  • The encoder architecture takes two images along with pose and camera parameters.
  • Learn about the encoder, decoder, and GPlayer functions.
  • Read epipolar geometry notes.
  • Read about the Gaussian process and the prior.
  • Understand the cost volume computation.
  • Implement MVDepthNet.
  • How is the Gaussian process used in this case?
  • Try to run the code.
  • Can we use the same model for semantic segmentation? (Highest expected output of the thesis, suggest a model or future work)
  • Is there any work on semantic segmentation using cost volume?
  • Can we run the code on an Android phone?
  • Can the same architecture be applied to a dataset other than the one mentioned?
  • Prepare a rough proposal and the expected output of the master's thesis.
  • Next week's meeting agenda, Week 06
  • Talk about the Multi-View Stereo by Temporal Nonparametric Fusion topic.

Feb 01

  • Master thesis topic discussion.
  • Got a rough overview of the topic from Deebul.
  • Today's meeting agenda, Week 05, 01/02/2022
  • Talk about the Multi-View Stereo by Temporal Nonparametric Fusion topic.
  • Got a rough overview of multi-view stereo depth estimation.
  • Basic understanding of the disparity estimation concept.
  • Looked through the code.
  • Cost volume estimation from the MVDepthNet paper (not fully understood yet).
  • The iOS implementation of the algorithm is not available; contact the author?
  • How to go about starting the project?
  • Start with the literature?
  • Problem formulation?

Next week's meeting agenda, Week 06: talk about the Multi-View Stereo by Temporal Nonparametric Fusion topic.