Abstract: Internet-native audio-visual services are developing rapidly, and among them, object-based audio-visual services are gaining importance. In 2014, we established the Software Defined Media (SDM) consortium to target new research areas and markets involving object-based digital media and Internet-by-design audio-visual environments. In this paper, we introduce the SDM architecture, which virtualizes networked audio-visual services alongside the development of smart buildings and smart cities using Internet of Things (IoT) devices and smart building facilities. Moreover, we design the SDM architecture as a layered architecture, drawing on rapid advances in software-defined networking (SDN), to promote the development of innovative applications. We then implement a prototype system based on the architecture, present it at an exhibition, and provide it as an SDM API to application developers at hackathons, where various types of applications were developed using the API. An evaluation of SDM API access shows that the prototype SDM platform effectively provides 3D audio reproducibility and interactivity for SDM applications.
In addition to traditional viewing media, metadata that record the physical space from multiple perspectives will become extremely important for realizing interactive applications such as Virtual Reality (VR) and Augmented Reality (AR). This paper proposes the Software Defined Media (SDM) Ontology, designed to comprehensively describe spatio-temporal media and the systems that handle them. Spatio-temporal media refers to video, audio, and various sensor values recorded together with time and location information. The SDM Ontology can flexibly and precisely represent spatio-temporal media; the equipment and functions that record, process, edit, and play them; and related semantic information. We recorded classical and jazz concerts using many video cameras and audio microphones, processed and edited the video and audio data with related metadata, created a dataset using the SDM Ontology, and published it as linked open data (LOD). Furthermore, we developed "Web360²", a data-driven web application that obtains video and audio data and related metadata by querying the dataset, enabling users to interactively view and experience 360° video and spatial acoustic sound. We conducted a subjective evaluation using a user questionnaire.
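As a rough illustration of this data-driven lookup, the sketch below queries a SPARQL endpoint for recordings and their spatio-temporal metadata. The endpoint URL, the sdm: prefix, and all property names are assumptions made for illustration; the actual vocabulary is the one defined by the SDM Ontology and the published LOD dataset.

# Minimal sketch of querying an SDM LOD dataset with SPARQL.
# The endpoint URL and the sdm: vocabulary below are hypothetical;
# consult the published SDM Ontology for the real terms.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.org/sdm/sparql"  # hypothetical endpoint

QUERY = """
PREFIX sdm: <https://example.org/sdm/ontology#>  # hypothetical prefix
SELECT ?recording ?mediaUrl ?start ?lat ?long WHERE {
  ?recording a sdm:Recording ;
             sdm:mediaUrl  ?mediaUrl ;
             sdm:startTime ?start ;
             sdm:latitude  ?lat ;
             sdm:longitude ?long .
}
LIMIT 10
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)

# Each binding pairs a media resource with its spatio-temporal metadata,
# which a viewer like Web360² could use to place video and audio in space and time.
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["recording"]["value"], row["mediaUrl"]["value"],
          row["start"]["value"], row["lat"]["value"], row["long"]["value"])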
In order to analyze the motion of motorcycles and other vehicles, it is desirable to measure and record data using a combination of data loggers and video and audio recording devices, and to accurately integrate the collected data. For this purpose, the position and time at which data were recorded must be captured with sufficient accuracy. In recent years, it has become relatively easy to determine the recorded position with centimeter-order accuracy by using high-precision satellite positioning such as RTK-GNSS. On the other hand, determining the position of a moving object from data timestamps with centimeter-order accuracy often requires a time accuracy of 1 millisecond or better.

The timestamps used in conventional data loggers, general video cameras, smartphones, and other motion-sensing and video-recording devices cannot easily achieve a time accuracy of 1 millisecond, owing to limitations such as the accuracy of the built-in clock generator and the effects of communication delays related to synchronization. The built-in clock generators of consumer equipment may have an error on the order of 100 ppm due to various factors; this error cannot be ignored, since at 100 ppm the drift reaches 1 millisecond after a recording of only 10 seconds (10 s × 100 × 10⁻⁶ = 1 ms). The problem is further complicated when software is involved: in some video cameras, the video frame interval fluctuates by several tens of percent due to load fluctuations in video encoding. A calibration signal is commonly used to guarantee the time accuracy of data loggers, and professional equipment performs high-precision time management with SMPTE timecode (SMPTE12M-1, 2008), IEEE 1588, and the like, but such mechanisms are not common in consumer equipment. In general, synchronizing data loggers at remote locations, or across different media such as inertial motion and video/audio recording, is not easy.

This paper proposes a method and system for time synchronization between independent data loggers, motion sensors, and different media, such as video and audio, within a 1-millisecond error. The proposed method exploits the fact that the one-second timing pulses (PPS signals) generated by typical GNSS receivers are accurate to the order of 100 nanoseconds. The proposed system achieves synchronization by embedding PPS signals in the data of an inertial measurement unit (IMU) and by capturing LEDs driven in synchrony with PPS signals in the camera images.

Figure 1 shows a block diagram of the main parts of the prototype experimental system on the motorcycle. A logger collects observation data from the GNSS receiver, the IMU, and the vehicle, while a video camera records video images. The GNSS receiver connected to the logger receives correction information from an electronic reference point via a server and performs satellite positioning with centimeter-level positional accuracy and nanosecond-level time accuracy using the RTK method. The IMU device used in the proposed system simultaneously records the PPS signal at each sampling instant and can therefore identify the time with the accuracy of the IMU sampling frequency. Using our team's previous work (Ando et al., 2020), we can identify the time with even higher accuracy by analyzing the IMU data over a longer period; this method takes advantage of the stable, constant-interval nature of IMU sampling.
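A minimal sketch of this idea follows: because IMU samples arrive at a stable, nearly constant interval, the sample indices at which successive PPS edges are observed can be fitted by least squares, yielding a sub-sample mapping from sample index to GNSS time. The function and variable names are illustrative assumptions, not the authors' implementation.

# Sketch: estimate per-sample timestamps from PPS marks embedded in an
# IMU stream, assuming a stable (but imprecisely known) sampling rate.
# Illustrative only; not the authors' implementation.
import numpy as np

def fit_sample_clock(pps_sample_indices, pps_times_s):
    """Least-squares fit of GNSS time (s) as a linear function of sample index.

    pps_sample_indices: index of the IMU sample at which each PPS edge was seen.
    pps_times_s:        GNSS time of each PPS edge (integer seconds).
    Returns (period_s, offset_s) such that t(n) = period_s * n + offset_s.
    """
    period_s, offset_s = np.polyfit(pps_sample_indices, pps_times_s, 1)
    return period_s, offset_s

# Example: nominal 1 kHz IMU whose clock runs fast by 100 ppm.
true_period = 1e-3 * (1 + 100e-6)
pps_idx = np.array([round(k / true_period) for k in range(60)])  # one edge per second
pps_t = np.arange(60, dtype=float)

period, offset = fit_sample_clock(pps_idx, pps_t)
t_of_sample = lambda n: period * n + offset   # timestamp of any IMU sample
print(f"estimated period: {period*1e3:.6f} ms")  # ~1.000100 ms, drift recovered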
In an evaluation of this approach with a sampling frequency of 1 kHz, the time was identified with an average error of 0.136 milliseconds.

Although some video cameras can accept an electrical signal for time synchronization, the timing of data encoding and the computational load may cause a discrepancy with the recorded time. The proposed method therefore projects a synchronizing LED directly into the video image. If the frame rate during video recording is stable and constant, the shooting time can be accurately determined by analyzing the image of a single LED lit by the PPS signal, using the same method as in the IMU case described above. In an evaluation using an action camera with a frame rate of 240 fps and a shutter speed of 1/3840 seconds, we confirmed that the time is determined with an accuracy of about 0.4 milliseconds.

For video cameras whose frame rate is not stable, we developed the GNSS LED Beacon shown in Figure 2, which uses multiple LEDs to enable highly accurate time synchronization. Figure 3 shows an outline of the LED matrix drive circuit. The 8x8 LED matrix displays a 10-bit time signal in 1-millisecond increments, generated by a counter that is driven by a 16 kHz asynchronous clock input and reset by the PPS signal. The time can thus be read directly from a video image of the GNSS LED Beacon with an accuracy of up to 1 millisecond.

The snapshot in Figure 4 shows the GNSS LED Beacon attached to the handlebars of the motorcycle and the 360-degree camera capturing driving images. Figure 5 is a screenshot of the operation logs of the brake (red) and turn signal (yellow), displayed on Google Earth as trajectories offset 30 cm above and to the left and right of the motorcycle's travel path.
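The following sketch illustrates the counter-and-display scheme as described above: the 16 kHz clock is divided down to millisecond ticks, the counter is reset on each PPS edge, and the 10-bit count is shown on the matrix, so a single decoded frame yields the milliseconds elapsed within the current second. The bit-to-LED assignment here is a hypothetical choice for illustration; the actual layout is fixed by the drive circuit in Figure 3.

# Sketch of the GNSS LED Beacon time encoding/decoding. A 16 kHz clock
# divided by 16 yields 1 ms ticks; a 10-bit counter (0..999) is reset by
# each PPS edge and displayed on the LED matrix. The mapping of bits to
# LED positions below is hypothetical; the real layout is fixed by the
# drive circuit (Figure 3).

CLOCK_HZ = 16_000
TICKS_PER_MS = CLOCK_HZ // 1000   # 16 clock ticks per millisecond

def counter_to_bits(clock_ticks_since_pps):
    """10-bit millisecond count within the current second, LSB first."""
    ms = (clock_ticks_since_pps // TICKS_PER_MS) % 1000
    return [(ms >> b) & 1 for b in range(10)]

def bits_to_ms(bits):
    """Decode the 10 LED states (LSB first) back to milliseconds."""
    return sum(bit << b for b, bit in enumerate(bits))

# A video frame that captures the matrix 307 ms after the PPS edge:
bits = counter_to_bits(307 * TICKS_PER_MS + 5)  # mid-millisecond capture
assert bits_to_ms(bits) == 307
print(bits_to_ms(bits), "ms after the last PPS edge")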