Meeting Notes

November 18th

  • Android part at the beginning
  • Compare with the state of the art, emphasising lower computation
  • Temporal fusion with 3D convolution (see the sketch after this list)
  • Explain why we use only a few frames for temporal fusion
  • Keep the block diagram small, or omit it if it is standard
  • Annotate the dataset diagram with the relevant numbers
  • Small charts?
  • Dataset in Google Colab
  • Keep the statements in the slides small and concise
  • Frame-skipping experiments
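
A minimal sketch of what temporal fusion with a 3D convolution could look like over stacked per-frame latent features; the module name, channel count, and window size are illustrative assumptions, not the project's actual code.

```python
# Hedged sketch: fuse a short window of per-frame latent feature maps with a
# 3D convolution, assuming features stacked as (B, C, T, H, W).
import torch
import torch.nn as nn

class Conv3dTemporalFusion(nn.Module):
    """Fuses T temporal feature maps into a single feature map."""
    def __init__(self, channels: int, window: int = 4):
        super().__init__()
        # Kernel spans the full temporal window, 3x3 spatially.
        self.fuse = nn.Conv3d(channels, channels,
                              kernel_size=(window, 3, 3),
                              padding=(0, 1, 1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, T, H, W) -> (B, C, 1, H, W) -> (B, C, H, W)
        return self.fuse(feats).squeeze(2)

# Usage: stack encoder features of 4 consecutive frames along dim=2.
fusion = Conv3dTemporalFusion(channels=64, window=4)
x = torch.randn(2, 64, 4, 128, 160)
print(fusion(x).shape)  # torch.Size([2, 64, 128, 160])
```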

November 9th

  • Frame skipping
  • For the trained model, change the batch size and compare the results (see the sketch after this list)
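
A rough sketch of how the frame-skipping and batch-size experiments on the already-trained model might be run; `model`, the dataset, and the pixel-accuracy metric are placeholders for the project's own objects.

```python
# Hedged sketch: evaluate a trained model while keeping only every k-th frame
# of a sequence and while varying the batch size.
import torch
from torch.utils.data import DataLoader, Subset

def skipped_indices(n_frames, skip):
    # skip=2 keeps frames 0, 2, 4, ...
    return list(range(0, n_frames, skip))

def evaluate_with_skip(model, dataset, skip, batch_size, device="cuda"):
    loader = DataLoader(Subset(dataset, skipped_indices(len(dataset), skip)),
                        batch_size=batch_size, shuffle=False)
    model.eval().to(device)
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds.cpu() == labels).sum().item()
            total += labels.numel()
    return correct / total  # pixel accuracy

# Usage idea: compare skip factors and batch sizes on the trained model.
# for skip in (1, 2, 4):
#     for bs in (1, 4, 8):
#         print(skip, bs, evaluate_with_skip(model, val_set, skip, bs))
```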

November 4th

  • Experiment with different batch sizes and their impact on the output, only for the fog and morning variations
  • List the memory footprint (see the sketch after this list).
  • Take eight consecutive sequence images and compare them across the Vanilla, GP, and LSTM models.
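
A hedged sketch for listing the memory footprint per batch size using PyTorch's peak-memory counters; `model` and the input resolution are assumed placeholders.

```python
# Hedged sketch: record the peak GPU memory of one forward pass for several
# batch sizes.
import torch

def peak_memory_mb(model, batch_size, shape=(3, 256, 320), device="cuda"):
    model.eval().to(device)
    torch.cuda.reset_peak_memory_stats(device)
    with torch.no_grad():
        model(torch.randn(batch_size, *shape, device=device))
    return torch.cuda.max_memory_allocated(device) / 1024 ** 2

# for bs in (1, 2, 4, 8):
#     print(f"batch size {bs}: {peak_memory_mb(model, bs):.1f} MB")
```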

October 12th

  • Write the dataset variations.
  • Dice coefficient (see the sketch below)
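
A small sketch of the Dice coefficient for label-encoded segmentation masks, as a starting point; the class count and smoothing term are assumptions.

```python
# Hedged sketch: per-class Dice coefficient for masks given as integer class
# labels; the eps term avoids division by zero for absent classes.
import torch

def dice_coefficient(pred, target, num_classes, eps=1e-6):
    scores = []
    for c in range(num_classes):
        p = (pred == c).float()
        t = (target == c).float()
        intersection = (p * t).sum()
        scores.append((2 * intersection + eps) / (p.sum() + t.sum() + eps))
    return torch.stack(scores)  # one score per class; .mean() gives macro Dice

# pred, target: (H, W) or (B, H, W) tensors of class indices.
```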

September 20th

  • Transformers.
  • W-Net.
  • U-Net.
  • The VKITTI results should be done.
  • Change the learning rate and loss function to improve the results on the ScanNet dataset.
  • Change the U-Net base model to improve performance.

September 6th

  • Train on the furniture class for 10 epochs with the Vanilla, GP, and LSTM models.
  • Learning rate scheduler (see the sketch after this list)
  • TensorBoard
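
A hedged sketch combining a step learning-rate scheduler with TensorBoard logging for the 10-epoch runs; `model`, `train_loader`, and `criterion` are placeholders for the project's own objects, and StepLR is just one possible scheduler.

```python
# Hedged sketch: 10-epoch training loop with an LR scheduler and TensorBoard.
import torch
from torch.utils.tensorboard import SummaryWriter

def train_10_epochs(model, train_loader, criterion, log_dir="runs/furniture_vanilla"):
    # model, train_loader, criterion come from the project (placeholders here).
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.1)
    writer = SummaryWriter(log_dir=log_dir)
    model.train().cuda()
    for epoch in range(10):
        for images, masks in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images.cuda()), masks.cuda())
            loss.backward()
            optimizer.step()
        scheduler.step()                                   # decay the LR every 4 epochs
        writer.add_scalar("train/loss", loss.item(), epoch)
        writer.add_scalar("train/lr", scheduler.get_last_lr()[0], epoch)
    writer.close()
```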

August 23rd

  • No meeting this week.
  • Updated chapter 2 of the report
  • Start writing the results in the experimental section
  • Focus on the metrics and explain the results
  • Plot the covariance plot

August 9th

August 2nd

  • Run weighted cross-entropy, focal loss, and plain cross-entropy for 2 hours each and check which works best
  • AdamW, PyTorch learning rate finder (see the sketch after this list)
  • https://github.com/davidtvs/pytorch-lr-finder
  • Leave for the end: learning rate scheduler; try two schedulers
  • 186 sequences in total; 149 remain for validation and testing.
  • How to compare single-frame segmentation (Vanilla) vs. multi-frame segmentation (latent frame fusion, GP)?
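
A hedged sketch of the linked pytorch-lr-finder used together with AdamW and a weighted cross-entropy loss; `model`, `train_loader`, and `class_weights` are placeholders, and the LR range and iteration count are illustrative.

```python
# Hedged sketch: learning-rate range test with the linked torch-lr-finder package.
import torch
import torch.nn as nn
from torch_lr_finder import LRFinder  # pip install torch-lr-finder

def find_lr(model, train_loader, class_weights=None):
    # model, train_loader, class_weights are the project's objects (placeholders).
    criterion = nn.CrossEntropyLoss(weight=class_weights)           # weighted CE variant
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-7, weight_decay=1e-2)
    lr_finder = LRFinder(model, optimizer, criterion, device="cuda")
    lr_finder.range_test(train_loader, end_lr=10, num_iter=100)
    lr_finder.plot()   # inspect the loss-vs-LR curve; pick a value near the steepest descent
    lr_finder.reset()  # restore the model and optimizer to their initial state
```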

July 26th

  • Reduce the number of sequences
  • Are the sequences related to each other?
  • Augmentation to improve accuracy
  • Without augmentation (for comparison)
  • Reduce the time per epoch

July 5th

  • Look at the paper on Infinite-Horizon Gaussian Processes
  • Gaussian process regression (see the sketch after this list)
  • Link between the Gaussian process and linear regression
  • Linear regression with a Gaussian process
  • Evaluation chapter: each section should state its research question, the results for it, and the conclusion for that research question.
  • Evaluation is the most important part
  • Compare the results of U-Net with temporal fusion against W-Net
  • In the evaluation, drop the check of results on the third sequence for the vanilla model
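
A minimal NumPy sketch of Gaussian process regression with an RBF kernel, to make the link to (Bayesian) linear regression concrete; the kernel hyperparameters and toy data are illustrative, not part of the project.

```python
# Hedged sketch: GP posterior mean and covariance on 1-D toy data.
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf(x_train, x_test)
    K_ss = rbf(x_test, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

x = np.linspace(0, 5, 20)
y = np.sin(x) + 0.05 * np.random.randn(20)
mu, cov = gp_posterior(x, y, np.linspace(0, 5, 100))
```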

June 28th

  • Check the validation of temporal fusion
  • Train on one sequence and test on another sequence
  • How many sequences are in the ScanNet dataset?
  • Dataloader that works on multiple sequences (see the sketch after this list)
  • Create a dataset sequence loader
  • Sequence loader
  • Multi-sequence sets of 2, 3, 4, and 5; analyse the impact of the GPlayer
  • IoU
  • Accuracy
  • Precision
  • ROC curve
  • AUC
  • Replica dataset, Habitat
  • Common framework for the dataset loader
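
A possible shape for the multi-sequence window loader (sets of 2 to 5 consecutive frames) mentioned above; `load_image` and `load_label` are hypothetical helpers for the project's own data format.

```python
# Hedged sketch: a dataset that yields windows of consecutive frames from
# several ScanNet-style sequences, so the GPlayer impact can be analysed per
# window size.
import torch
from torch.utils.data import Dataset

class MultiSequenceWindows(Dataset):
    def __init__(self, sequences, window=3):
        # sequences: list of lists of (image_path, label_path) tuples
        self.window = window
        self.sequences = sequences
        self.index = [(s, i) for s, seq in enumerate(sequences)
                      for i in range(len(seq) - window + 1)]

    def __len__(self):
        return len(self.index)

    def __getitem__(self, idx):
        s, start = self.index[idx]
        frames = self.sequences[s][start:start + self.window]
        # load_image / load_label are hypothetical helpers for the data format.
        images = torch.stack([load_image(p) for p, _ in frames])   # (T, C, H, W)
        labels = torch.stack([load_label(l) for _, l in frames])   # (T, H, W)
        return images, labels
```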

June 7th

  • Create a baseline with U-Net and train it on the Cityscapes dataset (see the sketch after this list)
  • Add the Gaussian process in the latent space
  • Mail Prof. Nico and Prof. Houben with the research question and proposal
  • Research adding a cost volume to the input, taking the idea from MVS temporal nonparametric fusion
  • Single and hybrid temporal fusion.
  • Create a set of segmentation datasets along with the camera poses.
  • Evaluate by taking two or three consecutive images and checking how they change.
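
A hedged sketch of loading Cityscapes semantic masks with torchvision for the U-Net baseline; the root path, resolution, and batch size are placeholders, and the dataset files must already be downloaded locally.

```python
# Hedged sketch: Cityscapes semantic-segmentation data loading for the baseline.
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

image_tf = transforms.Compose([transforms.Resize((256, 512)),
                               transforms.ToTensor()])
# Nearest-neighbour resize so class labels are not interpolated.
target_tf = transforms.Compose([
    transforms.Resize((256, 512), interpolation=transforms.InterpolationMode.NEAREST),
    transforms.PILToTensor()])

train_set = datasets.Cityscapes("data/cityscapes", split="train", mode="fine",
                                target_type="semantic",
                                transform=image_tf, target_transform=target_tf)
train_loader = DataLoader(train_set, batch_size=4, shuffle=True, num_workers=2)

images, masks = next(iter(train_loader))   # images: (4, 3, 256, 512)
```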

June 3rd

Feedback


  1. Right now your work has nothing to do with 3D reconstruction
  2. The application area is wrong
  3. Our goal is not 3D reconstruction
  4. Conclusion:
    • Change the introduction of the topic to focus on the two things you said you will do:
      • Semantic segmentation
      • Temporal fusion using camera poses
      • Efficiency
  5. Focus on what the meeting is for: it is not a dumping exercise ("let me dump whatever I know in this meeting"). The meeting was to sell the research question to the professor, to explain why the problem is important. Keep the slides on what you have done so far as backup and show them only if he asks.

Questions:

  1. Not interested in depth estimation
  2. Only interested in segmentation
  3. We had an implementation for Android; did you use the IMU?
  4. Sample the Gaussian process
  5. Interesting question: the impact of kernels
    • Do we need to retrain when we change the kernel?

May 31st

  • Mail to professor
  • Fine with the research questions
  • Downloaded the dataset
  • Temporal fusion + low-computation device

May 24th

Fine-tuning research questions

  • Question type: Characterization. Result: Qualitative and descriptive models
  • What are the works on state-of-the-art temporal nonparametric fusion?
  • How do the results from RQ1 compare with each other for performing temporal fusion?
  • Can we cross-transfer temporal nonparametric fusion to other tasks, such as object detection or segmentation?
  • Can we cross-transfer temporal nonparametric fusion to segmentation?
  • How does improving the resolution of the monocular depth map compare with state-of-the-art resolution-improving architectures?
  • How does incorporating a cost volume in the decoder of temporal fusion improve the resolution of the depth map?
  • What work exists in the temporal fusion domain?
  • Are the results of the temporal nonparametric fusion architecture reproducible?
  • What error metrics are available for monocular depth estimation?
  • Question type: Discrimination. Result: System
  • How do the results obtained from incorporating a new error metric in temporal nonparametric monocular depth estimation compare with previous work on monocular depth estimation?
  • Methods/Means: Analytical model
  • Can we incorporate adaptive thin volumes (ATVs) in the decoder of temporal nonparametric fusion to improve the resolution of the depth images? (This has been done for typical encoder-decoder architectures, but not with temporal nonparametric fusion.)
  • Compare the results obtained by incorporating adaptive thin volumes in the decoder with the resolution of images obtained from a standard depth estimation algorithm.
  • Can we incorporate temporal nonparametric fusion into SegNet to perform segmentation of the input images? (No work combines temporal nonparametric fusion with an encoder-decoder architecture for segmentation.)
  • Compare the results of this architecture with a standard segmentation architecture.
  • Can we perform image classification on temporal image data, with fusion of information in the latent space using the Gaussian process?

May 17th

  • Title: Efficient multi-view stereo depth estimation
  • Image related to depth estimation: two-image multi-view, camera pose
  • Raise an issue related to the deployment on Android
  • Run the data using the code from the DeMoN GitHub page
  • Create a Python pipeline that passes the same data through both architectures and compares the results (see the sketch after this list).
  • https://github.com/robustrobotics/multi_view_stereonet
  • Download the data from https://github.com/lmb-freiburg/demon
  • Look for papers that use a fusion of information in latent space.
  • Study of the Gaussian process
  • RQ2: functionality testing using different cost volumes and the Gaussian process; split it into two questions
  • RQ3: cross-application of the MVS approach
  • Remove robustness
  • Add a comparison table of the MVS and latest-paper results in the experiments; write a question and answer it.
  • Look for papers related to MVS using two images, fusion in latent space (hybrid fusion), and depth estimation
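
A rough sketch of the single-sample comparison pipeline between two architectures; `model_a`, `model_b`, the sample layout, and the abs-rel metric are assumptions about the project's interfaces, not its actual code.

```python
# Hedged sketch: pass the same (image pair, pose) sample through two depth
# networks and compare a simple per-pixel error.
import torch

def compare_models(model_a, model_b, sample, gt_depth, device="cuda"):
    left, right, pose = (t.to(device) for t in sample)
    gt = gt_depth.to(device)
    valid = gt > 0                         # ignore pixels without ground truth
    results = {}
    for name, model in (("baseline", model_a), ("mvs_fusion", model_b)):
        model.eval().to(device)
        with torch.no_grad():
            depth = model(left, right, pose)
        results[name] = ((depth - gt).abs() / gt)[valid].mean().item()
    return results   # {"baseline": ..., "mvs_fusion": ...}
```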

May 12th

Feedback

The presentation was only okay: not aligned, images at random locations. It is acceptable for a discussion, but make it a habit now so that you don't make these mistakes in the defense. The explanations were not crisp; good for a first time, but we need a crisp, well-rehearsed explanation of the methods and techniques, especially of the MVS paper.

  1. The example image is not good

    • We need a better example image
  2. The introduction is way too high-level.

    • Too much ground to cover before reaching the actual topic
  3. Animate the MVS architecture.

  4. The explanation of the mobile development needs a much better (pipeline) diagram, and it also needs to be animated while you talk.

Research questions:

  • RQ1: Robustness of MVS approaches
  • RQ2: Functionality testing using different cost volumes and the Gaussian process
  • RQ3: Cross-application of the MVS approach

May 10th

  • Add a block diagram to the deployment markdown file.
  • Forward snowballing (check which papers cite it); also check the papers it cites
  • Literature review: 1. General multi-view stereo (select stereo correspondence, cost). 2. Depth estimation (monocular, stereo, passive, active, mobile). 3. Android deployment.
  • Add noise to the data and check how the model performs (see the sketch after this list)
  • Change the cost volume and the error calculation.
  • What if we change the Gaussian process to a Student-t process?
  • Find at which parameters the model fails and propose the improvement (best case).
  • Give priority to building a baseline.
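
A small sketch for the noise experiment: perturb the inputs with Gaussian noise of increasing strength and re-evaluate; `model` and `val_loader` are placeholders, and pixel accuracy is just one possible metric.

```python
# Hedged sketch: add Gaussian pixel noise to the inputs and check how the
# model's predictions degrade.
import torch

def add_gaussian_noise(images, sigma):
    return (images + sigma * torch.randn_like(images)).clamp(0.0, 1.0)

@torch.no_grad()
def noisy_accuracy(model, val_loader, sigma, device="cuda"):
    model.eval().to(device)
    correct = total = 0
    for images, labels in val_loader:
        preds = model(add_gaussian_noise(images, sigma).to(device)).argmax(dim=1)
        correct += (preds.cpu() == labels).sum().item()
        total += labels.numel()
    return correct / total

# for sigma in (0.0, 0.05, 0.1, 0.2):
#     print(sigma, noisy_accuracy(model, val_loader, sigma))
```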

May 03rd

  • Discussed the different ways of deploying a PyTorch model on Android
  • Work on the milestones

April 26th

  • Make a markdown file on converting the Python code to Java.
  • Deploying Python code on Android (see the sketch after this list).
  • Reproduce the results from the paper.
  • Load the dataset and make it run.
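
A hedged sketch of one common route from Python to the Android (Java) side: export the trained model to TorchScript and a lite-interpreter file that the PyTorch Android API can load; `model` and the example input size are placeholders.

```python
# Hedged sketch: TorchScript export for PyTorch Mobile on Android.
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

model.eval()                               # `model` is the project's trained network
example = torch.randn(1, 3, 256, 320)      # assumed input size
traced = torch.jit.trace(model, example)
mobile_module = optimize_for_mobile(traced)
mobile_module._save_for_lite_interpreter("model.ptl")
# On Android, the Java side would load this file with
# org.pytorch.LiteModuleLoader.load("model.ptl").
```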

April 19th

  • Discussion on the GitHub folder structure
  • Discussed the extraction of images, quaternions, and translation matrices from the Android device
  • What is the baseline for the captured image?
  • We don't need to store the data for the real-time application
  • Can we run the deep learning model on an Android device?
  • How does the Gaussian process help in the depth estimation algorithm?
  • Target: reproduce the results from the pretrained model

April 12th

  • Discussion on the Problem Statement

March 08

  • Last Meeting Follow-up, Week 06, 10/02/2022

  • Overview of the paper.

  • Review of code elements.

  • Overview of the proposal.

  • Cost volume computation.

  • Able to run test.py.

  • Dataset problem.

  • encoder(left_image_cuda, right_image_cuda, KRKiUV_cuda_T, KT_cuda_T) computation (see the sketch at the end of this section).

  • Epipolar geometry (video lecture).

  • Today's meeting, Week 07, 17/02/2022

  • Study Gaussian processes

  • Learn the GPlayer code

  • Next week's meeting agenda, Week 06

  • Talk about the Multi-View Stereo by Temporal Nonparametric Fusion topic.
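
A hedged sketch of the plane-sweep idea behind the encoder(left, right, KRKiUV, KT) call: for each depth hypothesis, the right image is warped into the left view with pixel_right ~ K R K^-1 u + K t / d and a photometric cost is accumulated. Shapes and names here are illustrative, not the repository's exact implementation.

```python
# Hedged sketch: plane-sweep cost volume via homography warping.
import torch
import torch.nn.functional as F

def plane_sweep_cost_volume(left, right, KRKiUV, KT, depths):
    # left, right: (B, 3, H, W); KRKiUV: (B, 3, H*W) = K R K^-1 applied to
    # homogeneous pixel coordinates; KT: (B, 3, 1) = K t.
    B, _, H, W = left.shape
    costs = []
    for d in depths:
        warped = KRKiUV + KT / d                     # (B, 3, H*W), homogeneous
        u = warped[:, 0] / warped[:, 2]
        v = warped[:, 1] / warped[:, 2]
        # Normalise pixel coordinates to [-1, 1] for grid_sample.
        grid = torch.stack(((u / (W - 1)) * 2 - 1,
                            (v / (H - 1)) * 2 - 1), dim=-1).view(B, H, W, 2)
        right_warped = F.grid_sample(right, grid, align_corners=True)
        costs.append((left - right_warped).abs().mean(dim=1))   # (B, H, W)
    return torch.stack(costs, dim=1)                 # (B, num_depths, H, W)
```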

Feb 17th

  • Overview of the paper.
  • Cost volume construction.
  • Look at the rough view of the code.
  • The encoder architecture takes two images along with pose and camera parameters.
  • Learn about the encoder, decoder, and GPlayer functions.
  • Read epipolar geometry notes.
  • Read about the Gaussian process and the prior.
  • Understand the cost volume computation.
  • Implement MVDepthNet.
  • How is the Gaussian process used in this case?
  • Try to run the code.
  • Can we use the same model for semantic segmentation? (Highest expected output of the thesis, suggest a model or future work)
  • Is there any work on semantic segmentation using cost volume?
  • Can we run the code on an Android phone?
  • Can the same architecture be applied to a dataset other than the one mentioned?
  • Prepare a rough proposal and the expected output of the master's thesis.
  • Next week's meeting agenda, Week 06
  • Talk about the Multi-View Stereo by Temporal Nonparametric Fusion topic.

Feb 01

  • Master thesis topic discussion.
  • Got a rough overview of the topic from Deebul.
  • Today's meeting agenda, Week 05, 01/02/2022
  • Talk about the Multi-View Stereo by Temporal Nonparametric Fusion topic.
  • Got a rough overview of multi-view stereo depth estimation.
  • Basic understanding of the disparity estimation concept.
  • Looked through the code.
  • Cost volume estimation from the MVDepthNet paper (not fully understood yet).
  • The iOS implementation of the algorithm is not available; contact the author?
  • How to go about starting the project?
  • Start with the literature?
  • Problem formulation?

Next week's meeting agenda, Week 06: talk about the Multi-View Stereo by Temporal Nonparametric Fusion topic.