Transcription Project Files

You must read the entire contents of this webpage, from beginning to end.

Software Name:  Transcriber 

Transcriber Download Link:  https://sourceforge.net/projects/trans/files/transcriber/1.5.1/

Transcriber Video Tutorial (YouTube):  https://youtu.be/lWb8L9DdwnA

Please download the following files:

1.  The Transcriber (the software to do the project)

2.  The Video Tutorial on how to use the software 

3.  The Test (an audio file containing the speech that you need to transcribe)

4.   The Guidelines (in PDF, on how to type or transcribe each audio file)

5.   The Transcriber's Manual (in PDF, on how to use Transcriber to type the transcription).


In order to pass the test, please learn and master:

1.  The guidelines

2.  The Transcriber's Manual

3. The Video Tutorial

4.  Additional guidelines (see "Additional Client Directives" at the end of this webpage)

5.  The Sample Test Criterion (on How to Pass the Test)

6.  Common Mistakes to Avoid

If you fail any one of them, you will fail the test.

You must also follow all 6 (six) items above if you want your project files to be approved and to get paid.


-------------------------------------------------------------------------------------------------------------------------------------------------------------------

Indonesian Transcription

Audio File (the test: transcribe 10 minutes as the test):  

Transcriber's Manual: https://drive.google.com/file/d/1ocbWm8J1X18a7MhPvtieNTagd9j_VVMP/view

Transcription Guidelines:  https://drive.google.com/file/d/1_2py605YFd37EvpKbg6U4A15Up1sEf3x/view



PROJECT FILES (only for selected people who have passed the test):

The First 4 audio files:  

File 0462:   https://drive.google.com/file/d/1DuXaWRS_g8VKVBLIhRCy4fgIEjtK_iGj/view

File 0464:   https://drive.google.com/file/d/1orlrds5dpOb79Eye6-EeoFYe3NOoPdL-/view

File 0465:   https://drive.google.com/file/d/1PQRKGc3uhjbjff91cMtpdCjWn3RyJSx_/view

File 0467:   https://drive.google.com/file/d/1BxE5P-ZcJRSGHlK6wABr5DISTTQgmoom/view


The Second 4 audio files:

File 054:  https://drive.google.com/file/d/17NZUsqQuBNbCTTDszvo_MtyUpiyGNA10/view

File 055:  https://drive.google.com/file/d/1iZZPGerLHrLqG5Lyb7zPjSNlAJKGtmb3/view

File 056:  https://drive.google.com/file/d/1XMnDgJ5TsODTFbXw9WoxJokiIvV_ur3O/view

File 058:  https://drive.google.com/file/d/114ycGFqoDjEfBp5ouWtsgRhuHiAxcRbH/view


------------------------------------------------------------------------------------------------------------

Taiwanese Transcription

Audio File (the test):  https://drive.google.com/file/d/19xIM1XiYWBfJjB8pFbGmGmv0OF_mVNpG/view

Transcription Guidelines: https://drive.google.com/file/d/1_4Z9Mfr-a3g8TDxbluqmuyKuJ3v_P-Jl/view

Transcriber's Manual:  https://drive.google.com/file/d/13DbY0mPXLLc4MtqYRYaho8nqi_4Bfk-m/view


PROJECT FILES (only for selected people who have passed the test):

The first ....(waiting for you to pass the test)...  files in a zip file: 

----------------------------------------------------------------------------------------------------------

Japanese Transcription

The Zip file (containing the 3 files: the Sample Test file, Guidelines and Transcriber Manual):  

https://drive.google.com/file/d/1g45z5TjnEYTQDjFpktRZiiIn99Fu4Wkx/view (Google Drive)


PROJECT FILES (only for selected people who have passed the test):

The first ........(waiting for you to pass the test)... files in a zip file: 

-----------------------------------------------------------------------------------------------------------

Please be sure the sample test meets the following criteria (to be accepted): 

1) It is done using Transcriber with proper segmentation. Segments cannot be longer than 15 seconds, but to save time you do not have to make them very short either: a single segment may contain commas and periods.


2) The transcription is accurate and full verbatim. 


3) All guidelines are carefully followed, with special attention to proper tagging and labeling. 
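
The 15-second limit in criterion 1 is easy to check mechanically once you have the segment boundary times. The following is only a minimal sketch: it assumes you already have the boundary times as a sorted list of seconds (how you extract them from your .trs file is not covered here), and the function name `find_long_segments` is made up for this illustration.

```python
MAX_SEGMENT_SECONDS = 15.0  # limit stated in criterion 1


def find_long_segments(boundaries, max_len=MAX_SEGMENT_SECONDS):
    """Return (start, end, duration) for every segment longer than max_len.

    `boundaries` is a sorted list of segment boundary times in seconds.
    For example, [0.0, 4.2, 12.0] describes two segments:
    0.0-4.2 and 4.2-12.0.
    """
    too_long = []
    # Pair each boundary with the next one to get (start, end) per segment.
    for start, end in zip(boundaries, boundaries[1:]):
        duration = end - start
        if duration > max_len:
            too_long.append((start, end, duration))
    return too_long


if __name__ == "__main__":
    # Hypothetical boundary times, for illustration only.
    times = [0.0, 9.5, 27.0, 40.0]
    for start, end, duration in find_long_segments(times):
        print(f"Segment {start:.1f}-{end:.1f} s lasts {duration:.1f} s; split it.")
```

A segment of exactly 15 seconds is accepted by this sketch, since the criterion only rules out segments longer than 15 seconds.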



------------------------------------------------------------------------------------------------------------------------------ 
Additional Notes - Common Mistakes People are Making
  • Do not make up your own labels or tags, only use those listed explicitly in the Guidelines.
  • Please note that all tags have white space around the content, except for double parentheses (()). For example: <initial> IBM </initial> and NOT <initial>IBM</initial>
  • If you're unsure, you can use a catch-all label like [noise] or [no-speech], just be sure not to put the wrong sound or make up a label.
  • Please be sure to encode .trs files in UTF-8. See the yellow highlighted text on page 2 of the Transcriber manual.
    The correct header should be:
    <?xml version="1.0" encoding="UTF-8"?>
    If you drag and drop a .trs file into Chrome or open it in a text editor, you can see the Transcriber tags in raw form to verify.
  • Don't forget to include sound tags. There are 15 in total, 5 of which ([no-speech], [noise], [overlap], [music], [applause]) are very important to include. If you are unsure about a sound, use [noise]. The remaining 10 can be found in section 2.3.1 of the Guidelines.
  • Pro tip: It is generally faster to completely segment the audio and create the speakers first and then to go back and transcribe after this framework is in place.
  • Pro tip: Drag and drop your .trs file into the Firefox browser for proofreading. This makes it much easier to spot errors with tags, spacing and orthography.
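
Several of the mistakes above can be caught with a short script before you submit. The sketch below is an illustration, not an official checker. It works on plain text that you pass in yourself, all function names are hypothetical, and the allowed sound-tag set contains only the five tags named above. Add the remaining ten from section 2.3.1 of the Guidelines yourself, otherwise they will be reported as unknown. It also assumes that all sound tags are written in square brackets.

```python
import re

# Only the five sound tags named on this page. The other ten are listed in
# section 2.3.1 of the Guidelines and are NOT reproduced here.
KNOWN_SOUND_TAGS = {"[no-speech]", "[noise]", "[overlap]", "[music]", "[applause]"}

PAIRED_TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)
BRACKET_TAG = re.compile(r"\[[^\[\]]+\]")


def has_utf8_header(file_text):
    """True if the first line is an XML declaration that names UTF-8."""
    lines = file_text.lstrip("\ufeff").splitlines()  # ignore a leading BOM
    first_line = lines[0] if lines else ""
    return first_line.startswith("<?xml") and 'encoding="UTF-8"' in first_line


def find_unspaced_tags(transcript):
    """Return paired tags whose content lacks white space on either side."""
    bad = []
    for match in PAIRED_TAG.finditer(transcript):
        content = match.group(2)
        # Slicing keeps this safe when the content is empty.
        if not (content[:1].isspace() and content[-1:].isspace()):
            bad.append(match.group(0))
    return bad


def find_unknown_sound_tags(transcript, allowed=KNOWN_SOUND_TAGS):
    """Return bracketed labels that are not in the allowed set, sorted."""
    found = set(BRACKET_TAG.findall(transcript))
    return sorted(tag for tag in found if tag not in allowed)


if __name__ == "__main__":
    sample = "saya kerja di <initial>IBM</initial> [noise] dan [my-own-label]"
    print(find_unspaced_tags(sample))
    print(find_unknown_sound_tags(sample))
```

Run `has_utf8_header` on the raw contents of the .trs file, and the two tag checks on the transcript text as you typed it. Special characters may appear in escaped form inside the raw file, so the tag checks are not guaranteed to work on the raw file directly.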
-----------------------------------------------------------------------------------------------------------

For Indonesian Transcription:


Additional Client Directives

[overlap]

Our client just added some clarification to the overlap tag and a new guideline that we need to implement going forward.
  • Case 1: If there is overlap but there is one dominant speaker, just transcribe the dominant speaker and ignore the other speaker.
  • Case 2: If there is overlap for less than 3 seconds and there is no dominant speaker, you can insert the [overlap] tag within the same segment and refrain from transcribing the overlapping speech. For example, say there's a segment where Speaker A is talking from 00:15 - 00:30. You label the speaker as Speaker A. But then, Speaker B chimes in with overlapping speech from 00:20-00:22 (for 2 seconds), while Speaker A is still talking. Because this overlapping speech is short, you do not need to create an isolated segment just for the [overlap]. You will insert the tag [overlap] for these 2 seconds within the same Speaker A segment. This will save you time because you do not need to create a new segment for short interruptions.
  • Case 3: If there is overlap and there is no dominant speaker AND the overlapping speech is more than 3 seconds long, please put [overlap] alone in its own segment and name the speaker for that segment “multiple”. Please see the example below.
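
As a quick summary of the three cases above (this is not the client's own example), the decision can be sketched as a small function. The function name and the returned wording are made up for illustration. The directive does not say what to do when the overlap lasts exactly 3 seconds, so treating that as Case 3 is an assumption of this sketch.

```python
def overlap_action(has_dominant_speaker, overlap_seconds):
    """Pick the handling for overlapping speech per the three cases above."""
    if has_dominant_speaker:
        # Case 1: one speaker dominates, so ignore the other speaker.
        return "transcribe the dominant speaker only"
    if overlap_seconds < 3.0:
        # Case 2: short overlap, no dominant speaker.
        return "insert [overlap] inside the current speaker's segment"
    # Case 3: longer overlap, no dominant speaker.
    # ASSUMPTION: exactly 3 seconds is treated as Case 3; the directive
    # only covers "less than" and "more than" 3 seconds.
    return 'put [overlap] alone in its own segment, speaker "multiple"'


if __name__ == "__main__":
    # The Speaker A / Speaker B example above: 2 seconds, no dominant speaker.
    print(overlap_action(False, 2.0))
```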