DeveloperAcademy-POSTECH/2024-NC2-A21-MachineLearning

2024-NC2-A21-MachineLearning

🎥 YouTube Link

(The YouTube link will be added once the video is made)

💡 About Machine Learning

(Summary of our research on the technology)

image

  • Machine learning is a methodology in which a system learns from images, text, sound, and similar data, so that it can respond on its own instead of having its behavior specified case by case.
  • It can recognize or analyze objects and sounds, and it can identify a user's patterns to recommend or predict a result.

image

🎯 What we focus on?

image

SoundClassification

  • ์ˆ˜๋งŽ์€ ์•„์ด๋””์–ด์ค‘ ์˜ค๋ธŒ์ ํŠธ ๋ถ„๋ฅ˜์™€ ์†Œ๋ฆฌ๋ฅผ ๋ถ„์„ํ•ด์ฃผ๋Š”๊ฒƒ์— ํฅ๋ฏธ๋ฅผ ๋А๊ผˆ๊ณ  ๊ทธ์ค‘ ์†Œ๋ฆฌ์— ์ง‘์ค‘ํ•ด๋ณด๊ณ  ์‹ถ์€ ์ƒ๊ฐ์ด ๋“ค์—ˆ์Šต๋‹ˆ๋‹ค
  • ์†Œ๋ฆฌ๋ฅผ ๊ณผ์—ฐ ์–ด๋–ป๊ฒŒ ๋ถ„์„ํ•˜๋Š”์ง€ ์›๋ฆฌ๋ฅผ ์•Œ์ง€ ๋ชปํ•ด์„œ ์ด๋ฒˆ ๊ธฐํšŒ์— ๋ฐฐ์›Œ๋ณด๊ณ ์‹ถ์—ˆ์Šต๋‹ˆ๋‹ค

AVFoundation

  • To use audio as the input, we decided the app needed a recording feature.

Ideation

image image

💼 Use Case

image

🖼️ Prototype

1. Main screen

image

  • Pressing the Start button opens recordView (the voice recording screen).

2. Voice recording screen

image

  • A randomly chosen passage is shown for the user to read aloud.
  • Pressing the record button starts recording the voice.
  • Pressing the stop button moves straight to the analysis result screen.

3. Voice analysis result screen

image

  • The trained model is applied to the recorded voice, and a result screen for the matching voice type is shown.

There are 8 voice types in total!

image

🛠️ About Code

(Explanation of the core code)

  • This section describes how the data for the model used in the prototype was collected and classified. We started by collecting voice data. To gather a large number of voices under classification criteria that were as consistent as possible, we used not only the learners' voices but also AI voices extracted from Naver CLOVA Dubbing, and we added background sounds as well. *The background sounds are there to handle the case where there is no sound to classify.
  • The collected data comes to 340 items in total, which we classified by gender and by voiceType. Gender simply separates male and female voices. voiceType groups voices by tone, intonation, and mood into 4 male and 4 female voice types, for 8 types in total.

image

When we first applied the model to the app, we analyzed the audio with a single voiceType model that combined male and female voices, without a gender model, and found that accuracy dropped considerably. For example, even a female voice was recognized as a male voice in the parts where the intonation dropped. We therefore first classify the audio as male or female with a gender model, and then apply the voiceType model that matches the detected gender.
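The two-stage idea described above can be sketched without any framework code. This is a minimal illustration, not the project's implementation: the `Classifier` closure type, the `classifyVoice` function, and the labels `maleTypeA` and `femaleTypeB` are hypothetical stand-ins for the trained models and their outputs.

```swift
// Hypothetical stand-in for a trained model: takes a recording name, returns a label.
typealias Classifier = (String) -> String

/// Runs the gender classifier first, then only the voice-type classifier
/// that matches the detected gender. Returns nil when the gender label is
/// neither "male" nor "female" (for example, background sound only).
func classifyVoice(
    recording: String,
    gender: Classifier,
    maleVoiceType: Classifier,
    femaleVoiceType: Classifier
) -> (gender: String, voiceType: String)? {
    let detected = gender(recording)
    switch detected {
    case "male":
        return (gender: detected, voiceType: maleVoiceType(recording))
    case "female":
        return (gender: detected, voiceType: femaleVoiceType(recording))
    default:
        return nil
    }
}

// Usage with dummy classifiers (all labels are made up for illustration).
let result = classifyVoice(
    recording: "sample.wav",
    gender: { _ in "female" },
    maleVoiceType: { _ in "maleTypeA" },
    femaleVoiceType: { _ in "femaleTypeB" }
)
print(result?.voiceType ?? "no voice detected")
```

Routing through the gender result means each voice-type model only has to separate four classes of the same gender, which is the accuracy problem the paragraph above describes.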

Code snippet

ResultsObserver implementation that receives the results of the audio analysis ⬇️

import Foundation
import SwiftUI
import AVFoundation
import SoundAnalysis

// Type that receives the results of the audio analysis.
/// An observer that receives results from a classify sound request.
class ResultsObserver: NSObject, SNResultsObserving {

    @Binding var classificationResult: String

    // Accumulated confidence per identifier across the analyzed time ranges.
    private var classifications: [String: Double] = [:]

    // Identifier with the highest accumulated confidence so far.
    var mostClassificationIdentifier: String

    // Initializer
    init(result: Binding<String>) {
        _classificationResult = result
        mostClassificationIdentifier = "" // Provide an initial value
        super.init() // Call super.init() after initializing all properties
    }

    /// Notifies the observer when a request generates a prediction.
    func request(_ request: SNRequest, didProduce result: SNResult) {

        // The result arrives as an SNClassificationResult,
        // which holds the top-ranked predictions for one time range.
        guard let result = result as? SNClassificationResult else { return }

        // classifications is [SNClassification], an array of the top
        // classification candidates, so .first is the highest-ranked one.
        guard let classification = result.classifications.first else { return }

        classifications[classification.identifier, default: 0.0] += classification.confidence

        // Find the dominant classification.
        // "BG" (background) is removed from the dictionary first.
        classifications["BG"] = nil

        // Pick the entry with the largest accumulated value. The dictionary is
        // empty when only "BG" has been seen, so avoid a force unwrap here.
        if let mostFrequentClassification = classifications.max(by: { $0.value < $1.value }) {
            mostClassificationIdentifier = mostFrequentClassification.key
        }
    }
}
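The aggregation step in the observer can be tried on its own with plain Swift values. The sketch below assumes each analysis window has already been reduced to its top identifier and confidence. The function name `dominantLabel` and the sample numbers are hypothetical. Like the observer, it sums confidence per label rather than counting occurrences.

```swift
/// Adds up confidence per label across analysis windows, ignores the
/// background label, and returns the label with the highest total.
func dominantLabel(
    in windows: [(identifier: String, confidence: Double)],
    ignoring backgroundLabel: String = "BG"
) -> String? {
    var totals: [String: Double] = [:]
    for window in windows where window.identifier != backgroundLabel {
        totals[window.identifier, default: 0.0] += window.confidence
    }
    // max(by:) returns nil for an empty dictionary, so no force unwrap is needed.
    return totals.max(by: { $0.value < $1.value })?.key
}

// Sample windows (made-up values): "female" totals 1.6, "male" totals 0.6.
let windows: [(identifier: String, confidence: Double)] = [
    (identifier: "female", confidence: 0.9),
    (identifier: "BG", confidence: 0.99),
    (identifier: "male", confidence: 0.6),
    (identifier: "female", confidence: 0.7)
]
print(dominantLabel(in: windows) ?? "none") // prints "female"
```

Returning an optional makes the "only background sound" case explicit, so the caller decides what to show when no voice was detected.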

Method that connects the models and analyzes the audio ⬇️

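// audioRecorder, genderObserver, userTypeObserver, gender, and userType
// are properties of the enclosing type (not shown in this excerpt).
func classifyAudio() {

    // Nothing to analyze until a recording exists.
    guard let fileURL = audioRecorder.recordedFile else {
        print("No recorded file to classify.")
        return
    }

    do {
        // Step 1: classify the recording as "male" or "female".
        let genderAnalyzer = try SNAudioFileAnalyzer(url: fileURL)
        let genderRequest = try SNClassifySoundRequest(mlModel: genderClassifier().model)
        try genderAnalyzer.add(genderRequest, withObserver: genderObserver)
        genderAnalyzer.analyze() // synchronous: returns after the file is processed

        gender = genderObserver.mostClassificationIdentifier
        print(gender)

        // Step 2: pick the voiceType model that matches the detected gender.
        let voiceRequest: SNClassifySoundRequest?
        switch gender {
        case "male":
            voiceRequest = try SNClassifySoundRequest(mlModel: maleVoiceClassifier().model)
        case "female":
            voiceRequest = try SNClassifySoundRequest(mlModel: femaleVoiceClassifier().model)
        default:
            voiceRequest = nil // no gender detected, so skip the second pass
        }

        if let voiceRequest = voiceRequest {
            let userTypeAnalyzer = try SNAudioFileAnalyzer(url: fileURL)
            try userTypeAnalyzer.add(voiceRequest, withObserver: userTypeObserver)
            userTypeAnalyzer.analyze()
        }
    } catch {
        print("Failed to classify audio: \(error.localizedDescription)")
    }

    userType = userTypeObserver.mostClassificationIdentifier
}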

Recording audio with AVFoundation ⬇️

import Foundation
import AVFoundation

class AudioRecorder: NSObject, ObservableObject, AVAudioPlayerDelegate {
    // Recording
    var audioRecorder: AVAudioRecorder?
    @Published var isRecording = false
    @Published var isNext = false

    // Playback
    var audioPlayer: AVAudioPlayer?
    @Published var isPlaying = false
    @Published var isPaused = false

    // URL of the recorded voice memo
    var recordedFile: URL?

    // Singleton instance
    static let shared = AudioRecorder()

    override init() {
        super.init()
        configureAudioSession()
        checkAudioRecordingPermission() // not shown in this excerpt
    }

    func configureAudioSession() {
        let audioSession = AVAudioSession.sharedInstance()
        do {
            try audioSession.setCategory(.playAndRecord, mode: .default)
            try audioSession.setActive(true)
        } catch {
            print("Failed to configure audio session: \(error.localizedDescription)")
        }
    }

    // Methods for recording the voice memo.
    // The recording format and file extension are set here.
    func startRecording() {
        let fileURL = getDocumentsDirectory()
            .appendingPathComponent("recording-\(Date().timeIntervalSince1970).wav")
        let settings: [String: Any] = [
            AVFormatIDKey: Int(kAudioFormatLinearPCM),
            AVSampleRateKey: 16000.0,
            AVNumberOfChannelsKey: 1, // mono
            AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue,
            AVLinearPCMBitDepthKey: 16,
            AVLinearPCMIsBigEndianKey: false,
            AVLinearPCMIsFloatKey: false
        ]

        do {
            audioRecorder = try AVAudioRecorder(url: fileURL, settings: settings)
            audioRecorder?.record()
            isRecording = true
            isNext = false
            print("Recording started")
        } catch {
            print("Failed to start recording: \(error.localizedDescription)")
        }
    }

    func stopRecording() {
        guard let recorder = audioRecorder else {
            print("Audio recorder is not initialized.")
            return
        }
        recorder.stop()
        recordedFile = recorder.url
        isRecording = false
        isNext = true
        print("Recording stopped: \(String(describing: recordedFile))")
    }

    func getDocumentsDirectory() -> URL {
        let paths = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)
        return paths[0]
    }
}
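As a rough check on what the recording settings above imply, the data rate of uncompressed linear PCM follows directly from the sample rate, bit depth, and channel count. The helper name `pcmBytesPerSecond` is hypothetical, and the figures ignore the small container header of the file.

```swift
/// Bytes per second of uncompressed linear PCM audio.
func pcmBytesPerSecond(sampleRate: Double, bitDepth: Int, channels: Int) -> Double {
    return sampleRate * Double(bitDepth / 8) * Double(channels)
}

// Values taken from the settings dictionary: 16 kHz, 16-bit, mono.
let bytesPerSecond = pcmBytesPerSecond(sampleRate: 16_000, bitDepth: 16, channels: 1)
// 16,000 samples/s x 2 bytes x 1 channel = 32,000 bytes/s
let tenSecondClip = bytesPerSecond * 10
print(bytesPerSecond, tenSecondClip)
```

At this rate a ten-second reading produces about 320 KB of audio data, so keeping each recording as a separate timestamped file in the documents directory stays cheap.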
