Spydus Help
    Digital Assets - Azure Cognitive Services
    This functionality requires additional commissioning and a fee applies. Please contact your Civica Account Manager for more details.

    If additional Azure services have been purchased as part of a library's Spydus contract, Azure Image Analysis will be performed automatically on files loaded into the ERM Digital Assets module, whether loaded individually or in bulk.

    Image Analysis can also be performed manually by selecting the desired records in the Folder or List interface (or an individual record at the Edit Record interface) and clicking the Analyse button in the action bar.

    Utilising Azure's Speech-to-text API, transcription can be performed manually at the Edit Record interface by clicking the Transcript drop-down menu, then clicking Generate. Speech-to-text transcription can be performed on the following audio file types: .WAV, .MP3, .OGG and .FLAC.

    Image Analysis

    Spydus uses the Microsoft Computer Vision API to extract metadata from images using artificial intelligence, and will automatically add the tags and flags detected in the images if the confidence score threshold configured in Digital Assets General Parameters is met or exceeded.
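
    The threshold check described above can be sketched as follows. This is a minimal illustration rather than Spydus's actual implementation: the function name tags_to_apply, the sample data and the 0.80 threshold are all hypothetical, and the name/confidence pairs only mirror the general shape of tag results returned by the Computer Vision API.

```python
def tags_to_apply(tags, threshold):
    """Return the names of tags whose confidence meets or exceeds the threshold."""
    return [t["name"] for t in tags if t["confidence"] >= threshold]


# Illustrative sample only; real scores come from the API response.
sample_tags = [
    {"name": "outdoor", "confidence": 0.99},
    {"name": "building", "confidence": 0.80},  # exactly at the threshold: kept
    {"name": "bicycle", "confidence": 0.42},   # below the threshold: dropped
]

applied = tags_to_apply(sample_tags, threshold=0.80)
print(applied)
```

    Raising the threshold yields fewer but more reliable tags; lowering it yields more tags at the cost of more false positives.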

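    Image analysis may be run manually after a Digital Asset has been uploaded, either by selecting the desired records in the Folder or List interface, or individually at the Edit Record interface, and then clicking the Analyse button in the action bar.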

    Subject categories 

    The Subject categories applied by Azure are a limited taxonomic categorisation. The chart below shows the '86-category concept' categories that may be automatically added via image analysis.

    images/ERM_DA_AZURE_SUBJ_CAT_TXN_thumb.png


    If the confidence score of the API for a Subject category meets or exceeds the threshold set in Digital Assets General Parameters, then that category will be automatically applied.

    Although the Subject categories automatically applied by the API are limited, custom categories may be added using the Edit Record interface.

    Adult, Racy and Gory content

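    The following content types can be detected and flagged by the Image Analysis API (per Microsoft's Detect adult content page): Adult (explicitly sexual images, often showing nudity or sexual acts), Racy (sexually suggestive images that are less explicit than Adult content) and Gory (images showing blood or gore).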

    If the confidence score of the API for an Adult, Racy or Gory flag meets or exceeds the threshold set in Digital Assets General Parameters, then that flag will be automatically applied.

    These flags only identify the content as being present in images; they do not hide the images. To restrict such images from display, apply the Suppress from OPAC display flag.
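
    As a rough sketch of how the three flags could be derived from the API's scores: the adultScore, racyScore and goreScore field names follow the adult block of a Computer Vision v3.x Analyze Image response, while the function name, the sample scores and the use of one shared threshold are assumptions made for illustration. How Spydus evaluates the scores internally is not documented here.

```python
# Maps the Spydus flag label to the score field in the API's "adult" block.
FLAG_FIELDS = {"Adult": "adultScore", "Racy": "racyScore", "Gory": "goreScore"}


def flags_to_apply(adult_block, threshold):
    """Return the flag labels whose score reaches the threshold.

    '>=' assumes that meeting the threshold is sufficient, as stated in the
    general Image Analysis description; a missing score is treated as 0.0.
    """
    return [
        flag
        for flag, field in FLAG_FIELDS.items()
        if adult_block.get(field, 0.0) >= threshold
    ]


# Hypothetical scores for a single image.
sample_adult_block = {"adultScore": 0.02, "racyScore": 0.91, "goreScore": 0.10}

flags = flags_to_apply(sample_adult_block, threshold=0.75)
print(flags)
```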

    Content tags

    Content tagging is a feature of Computer Vision's Analyze Image operation. Per Microsoft's page on Applying content tags to images:

    "Computer Vision returns tags based on thousands of recognizable objects, living beings, scenery, and actions. When tags are ambiguous or not common knowledge, the API response provides 'hints' to clarify the meaning of the tag in context of a known setting. Tags are not organized as a taxonomy and no inheritance hierarchies exist."

    If the confidence score of the API for a Content tag meets or exceeds the threshold set in Digital Assets General Parameters, then that tag will be automatically applied.

    Description

    Per Microsoft's page on Applying content tags to images, "A collection of content tags forms the foundation for an image 'description' displayed as human readable language formatted in complete sentences."

    In other words, the API uses the Content tags it detects in the image to form a Description. The Description has its own confidence threshold, configured separately in Digital Assets General Parameters, so it may reflect tags that were not applied to the image under the Content tags confidence threshold, or omit tags that were.
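
    To illustrate why a Description can differ from the applied Content tags, the sketch below applies two separate thresholds to one analysis result. Everything in it is hypothetical (the function name, the threshold values and the sample data); only the general idea that a caption carries its own confidence score is taken from the Computer Vision API.

```python
def select_metadata(result, tag_threshold, description_threshold):
    """Apply independent thresholds to the tags and to the caption."""
    tags = [
        t["name"] for t in result["tags"] if t["confidence"] >= tag_threshold
    ]
    caption = result["caption"]
    description = (
        caption["text"]
        if caption["confidence"] >= description_threshold
        else None
    )
    return tags, description


# Hypothetical analysis result for one image.
sample_result = {
    "tags": [
        {"name": "dog", "confidence": 0.95},
        {"name": "grass", "confidence": 0.60},
    ],
    "caption": {"text": "a dog sitting on grass", "confidence": 0.70},
}

tags, description = select_metadata(
    sample_result, tag_threshold=0.90, description_threshold=0.65
)
print(tags, description)
```

    Here the Description mentions grass even though the grass tag falls below the tag threshold and is not applied to the record.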

    Audio Transcription

    With the purchase of additional Azure Cognitive Services in the Digital Assets module comes the ability to automatically transcribe audio files to text files. The transcribed text files will be linked to the Digital Asset record, and can subsequently be edited and re-uploaded to correct any inaccuracies, or add annotations.

    A text extract, labelled Document Text, will also be linked to the DA record at the Full Display and may be viewed in-browser, but the extract will not contain all of the detail of the full transcription (e.g. identification of multiple speakers). Click the View Document Text link to view this extract. This extract will be fully indexed, allowing users to query Spydus for the content of an audio file as well as the meta-tags.

    The Transcript drop-down menu will appear if the file is a supported audio file format (.WAV, .MP3, .OGG, .FLAC).
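
    A minimal sketch of this kind of format check is shown below. The helper name supports_transcription is hypothetical, and testing the file extension alone is an assumption; Spydus may determine the file type differently.

```python
from pathlib import Path

# Extensions listed above as supported for Speech-to-text transcription.
SUPPORTED_AUDIO_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac"}


def supports_transcription(filename):
    """Return True if the file's extension is a supported audio format."""
    return Path(filename).suffix.lower() in SUPPORTED_AUDIO_EXTENSIONS


for name in ("interview.MP3", "oral_history.flac", "photo.jpg"):
    print(name, supports_transcription(name))
```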

    Generate

    Click the Transcript drop-down menu, then click the Generate option to start the transcription process in Azure. Depending on the size and length of the file, transcription can take a significant amount of time. Spydus will show a pop-up dialog box to indicate that transcription has begun. Once transcription has completed, a message will be sent to the Spydus inbox of the user who initiated the transcription.

    Download

    Once transcription has been completed, the Download option will become available. Use this option to download a local copy of the transcription as a text (.TXT) file. This file can then be edited if required, and uploaded using the Upload option.

    Upload

    If there is an existing text transcription that staff would prefer to use instead of using the Azure transcription service, click the Upload option in the drop-down menu. This option can also be used to replace a transcribed file that has been reviewed and edited.

    Delete

    If there is a transcription file linked to a Digital Asset record, it can be removed using the Delete option. Please note that the deletion of a transcribed file is permanent.