Commit 2dbc92c

constrainted-hmm: Notes on application to AD on cyclic processes
1 parent 4f817cc commit 2dbc92c

2 files changed

+72
-24
lines changed

handson/constrained-hmm/README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@ Some real-world examples are referenced at the bottom of this page.
2626

2727
- Provide an example on real data.
2828
For example, fitting a repeated sequential (cyclic) process, such as those found in automation/manufacturing.
29+
Maybe from MMII dataset?
2930

3031
## Implementation
3132

handson/constrained-hmm/notes.md

Lines changed: 71 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -1,40 +1,87 @@
11

22
## Others that want to do this
33

4+
Constraints inside the EM loop
5+
46
- https://github.com/jmschrei/pomegranate/issues/9
57

6-
## Implementation notes
8+
# Anomaly Detection for cyclic behavior
9+
10+
In manufacturing and automation, cyclic behavior is common in many processes.
11+
For example, in assembly, a machine might first do A, then B, then C,
12+
and the correct outcome depends on the correct order of operations, as well as correct execution of each operation.
13+
14+
In such a system, one wants to be able to detect anomalies, such as:
15+
16+
- Incorrect order of operations/states (in a cycle)
17+
- Cycles that do not complete / are aborted prematurely
18+
- Cycles that take abnormally long (or short) time
19+
- States (in a cycle) that are abnormally short or long
20+
- Anomalous data inside a particular stage
21+
22+
Thesis: A typical automation loop is well approximated by an HMM, using a `linear` topology.
23+
This, along with some preprocessing, should enable detecting all these kinds of anomalies.
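
As a rough illustration of the `linear` topology idea, the sketch below builds a transition matrix where each state may only stay or advance to the next one, and re-applies that mask to an estimated matrix (as one might do between EM iterations). The wrap-around from the last state back to the first is an assumption made here to represent a repeating cycle, and the function names `cyclic_transition_matrix` and `apply_topology` are hypothetical, not taken from any library.

```python
def cyclic_transition_matrix(n_states, p_stay=0.9):
    """Row-stochastic matrix: state i may only stay, or advance to (i + 1) % n_states."""
    if n_states < 2:
        raise ValueError("need at least 2 states")
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        matrix[i][i] = p_stay
        # Assumption: the last state wraps around to the first (cyclic process)
        matrix[i][(i + 1) % n_states] = 1.0 - p_stay
    return matrix


def apply_topology(estimated, allowed):
    """Zero out transitions that the topology forbids, then renormalize each row."""
    constrained = []
    for est_row, allowed_row in zip(estimated, allowed):
        masked = [p if a > 0.0 else 0.0 for p, a in zip(est_row, allowed_row)]
        total = sum(masked)
        if total == 0.0:
            raise ValueError("row has no allowed transition with non-zero mass")
        constrained.append([p / total for p in masked])
    return constrained


# Illustrative values only: a 3-state cycle A -> B -> C -> A
template = cyclic_transition_matrix(3, p_stay=0.8)
estimated = [
    [0.70, 0.20, 0.10],
    [0.10, 0.60, 0.30],
    [0.25, 0.25, 0.50],
]
constrained = apply_topology(estimated, template)
```

Standard Baum-Welch updates keep a transition at zero once it is zero, so in many libraries initializing with such a matrix may already be enough; the explicit mask is only needed when the training loop can reintroduce forbidden edges.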
24+
25+
A constrained HMM approach is particularly attractive when
26+
different stages in the cycle have different lengths (possibly also varying in length).
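
To make the anomaly types listed earlier more concrete, here is a minimal sketch of how a decoded state sequence for one cycle (for example from Viterbi decoding) could be checked for order, completeness and per-state duration. The helper names, the expected order and the duration limits are all hypothetical; in practice the limits would have to be estimated from normal cycles.

```python
from itertools import groupby


def run_lengths(states):
    """Collapse a per-sample state sequence into (state, duration) runs."""
    return [(state, sum(1 for _ in group)) for state, group in groupby(states)]


def check_cycle(states, expected_order, duration_limits):
    """Return a list of human-readable problems found in one decoded cycle."""
    problems = []
    runs = run_lengths(states)
    observed = [state for state, _ in runs]
    expected = list(expected_order)
    if observed != expected:
        if observed == expected[:len(observed)]:
            problems.append("incomplete cycle")
        else:
            problems.append("incorrect order")
    for state, length in runs:
        limits = duration_limits.get(state)
        if limits is None:
            continue  # no duration limits known for this state
        low, high = limits
        if not low <= length <= high:
            problems.append(f"abnormal duration in state {state}: {length}")
    return problems


# Illustrative values only (durations counted in samples)
expected_order = ["A", "B", "C"]
duration_limits = {"A": (2, 4), "B": (3, 6), "C": (1, 3)}

normal = check_cycle(list("AAABBBBCC"), expected_order, duration_limits)
aborted = check_cycle(list("AAABB"), expected_order, duration_limits)
swapped = check_cycle(list("AAACCBBBB"), expected_order, duration_limits)
```

Total cycle time can be checked the same way by summing the run durations. Anomalous data inside a stage is not covered by this sketch; that would need the per-state emission likelihoods.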
27+
28+
29+
## Datasets
30+
31+
#### Genesis Demonstrator
32+
https://www.kaggle.com/datasets/inIT-OWL/genesis-demonstrator-data-for-machine-learning
33+
34+
- Machine is a portable pick-and-place unit, with an air-powered gripper
35+
- Sorts two different materials (wood and metal) into their corresponding target locations
36+
- Materials can come from
37+
- 8 states in the State Machine of the program.
38+
- 3 kinds of anomalies introduced in different sub-datasets: linear drive jam, linear drive gradual impairment, air pressure gradual drop
39+
- Records 5(+4) continuous signals, 13 discrete signals
40+
41+
#### Bosch Research CNC Machining Data
42+
https://github.com/boschresearch/CNC_Machining
43+
44+
- Tri-axial accelerometer (Bosch CISS Sensor) mounted inside the machine.
45+
- Sampling rate equal to 2 kHz.
46+
- Normal as well as anomalous data have been collected over 6 different timeframes, each lasting 6 months, from October 2018 until August 2021, and labelled accordingly.
47+
- Data from three different CNC milling machines each executing 15 processes.
48+
49+
#### CNC turning: roughness, forces and tool wear
50+
https://www.kaggle.com/datasets/adorigueto/cnc-turning-roughness-forces-and-tool-wear?select=Prep.csv
51+
52+
Surface roughness was measured on six different spots after each machining run
53+
54+
? Might all be normal. Not sure whether outliers/anomalous conditions are present.
55+
56+
#### Turning Dataset for Chatter Diagnosis Using Machine Learning
57+
Dataset: https://data.mendeley.com/datasets/hvm4wh3jzx/1
58+
Paper: https://arxiv.org/abs/1905.08671
759

8-
In `pomegranate`, the HMM models have a public callback API.
9-
The callback `on_epoch_end` is called on each iteration.
10-
https://github.com/jmschrei/pomegranate/blob/f115a242a5b50854bbf199d43fe2cfd061e9708a/pomegranate/hmm.pyx#L2715
11-
The callback gets the model instance as self.model before on_training_begin
12-
But there is no way to get or set the transitions during training, as they are C arrays inside Cython
13-
Clean solution would be to expose an optional callback for modifying this.
14-
Otherwise have to stick to the workaround from https://github.com/jmschrei/pomegranate/issues/9
60+
- Two perpendicular single axis accelerometers, a tri-axial accelerometer, a microphone, and a laser tachometer.
61+
- Labels: no-chatter, intermediate chatter, chatter, and unknown.
62+
- The cutting test is performed by turning an Aluminum 6061 workpiece on a Clausing-Gamet 33 cm (13 inch) engine lathe
63+
- Four different cutting configurations were collected, where each cutting configuration depends on the stickout distance
64+
- For each stickout distance, we collect data for several combinations of the rotational speed and depth of cut
1565

16-
> I am using three vector approach to sparse matrices, but each vector is a private attribute right now
17-
> I'll add in a method which takes in either a dense or sparse matrix and calculates a new internal transition matrix from that.
66+
800 MB compressed, 6 GB uncompressed.
1867

19-
self.out_transition_log_probabilities
20-
self.in_transition_log_probabilities
21-
? which is the last one?
68+
### Custom data collection / demonstration
69+
Can maybe be done with a 3D printer or a CNC machine.
70+
Making the same part repeatedly.
71+
Ideally a part that is simple to make, so that one can collect many repetitions.
72+
Need a way to introduce realistic/plausible anomalies, in a safe manner.
2273

23-
Can have multiple batches per epoch.
24-
Calls self.summarize for each batch
25-
and then self.from_summaries for each epoch
2674

27-
It is in self.bake that transition_log_probabilities gets created, from the self.graph instance
28-
Could create methods like
29-
def get_transition_matrix()
30-
def set_transition_matrix()
31-
Would also need the state_name to index mapping to do much interesting
75+
### Manufacturing process as repeated-cycle
3276

33-
However, for simple disallowing of certain edges, one can use.
77+
In CNC mills or lathes, making a part is often a series of cuts.
78+
Combined, this makes one very long cycle, unique to the part.
79+
One can of course model this particular part/program as a long sequence of states, one state per unique cut.
3480

35-
BUT - this does not implement k-means initialization, which has to be done manually
81+
But one can maybe also model individual cuts as repeated instances of the same cycle (enter, cut, leave),
82+
without modelling each cut as a separate state?
83+
Perhaps they can be similar enough that anomalies can be detected.
3684

37-
In `sequentia`, the models just use hmmlearn internally, so one would need to use the hmmlearn approach there.
3885

3986
## VAD
4087
