Find and extract source text that must be translated.
Usage
find_source(
  path = ".",
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  tr = translator(),
  interface = NULL
)
find_source_in_files(
  paths = character(),
  encoding = "UTF-8",
  verbose = getOption("transltr.verbose", TRUE),
  algorithm = algorithms(),
  interface = NULL
)
Arguments
- path
- A non-empty and non-NA character string. A path to a directory containing R source scripts. All subdirectories are searched. Files that do not have a .R or .Rprofile extension are skipped.
- encoding
- A non-empty and non-NA character string. The source character encoding. In almost all cases, this should be UTF-8. Other encodings are internally re-encoded to UTF-8 for portability. 
- verbose
- A non-NA logical value. Should progress information be reported? 
- tr
- A Translator object.
- interface
- A name, a call object, or NULL. A reference to an alternative (custom) function used to translate text. If a call object is passed to interface, it must be a call to operator ::. Calls to method Translator$translate() are ignored and calls to interface are extracted instead. See Details below.
- paths
- A character vector of non-empty and non-NA values. A set of paths to R source scripts that must be searched. 
- algorithm
- A non-empty and non-NA character string equal to "sha1" or "utf8". The algorithm to use when hashing source information for identification purposes.
Value
find_source() returns an R6 object of class
Translator. If an existing Translator
object is passed to tr, it is modified in place and returned.
find_source_in_files() returns a list of Text objects. It may
contain duplicated elements, depending on the extracted contents.
Details
find_source() and find_source_in_files() look for calls to method
Translator$translate() in R scripts and convert them
to Text objects. The former further sets these resulting
objects into a Translator object. See argument tr.
find_source() and find_source_in_files() work on a purely lexical basis.
The source code is parsed but never evaluated (aside from extracted literal
character vectors).
- The underlying Translator object is never evaluated and does not need to exist (placeholders may be used in the source code).
- Only literal character vectors can be passed to arguments of method Translator$translate().
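The lexical approach can be illustrated with base R alone. The sketch below is not the package's implementation. It only shows that parsing source code yields call objects without looking up any binding. The name my_placeholder is hypothetical and deliberately bound to nothing.

```r
# `my_placeholder` is a hypothetical name that is bound to nothing.
src <- "my_placeholder$translate('Hello, world!')"

# parse() returns unevaluated expressions: nothing is looked up or called.
parsed   <- parse(text = src, keep.source = FALSE)
the_call <- parsed[[1L]]

exists("my_placeholder")  # FALSE, yet parsing succeeded
is.call(the_call)         # TRUE
the_call[[2L]]            # "Hello, world!", read directly from the call
```

Because the literal string is stored in the call itself, it can be read without evaluating anything.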
Interfaces
In some cases, it may not be desirable to call method
Translator$translate() directly. A custom function wrapping
(interfacing) this method may always be used as long as it has the same
signature as method
Translator$translate(). In other words, it must minimally
have two formal arguments: ... and source_lang.
Custom interfaces must be passed to find_source() and
find_source_in_files() for extraction purposes. Since these functions work
on a lexical basis, interfaces can be placeholders in the source code
(non-existent bindings) at the time these functions are called. However, they must
be bound to a function (ultimately) calling Translator$translate()
at runtime.
Custom interfaces are passed to find_source() and find_source_in_files()
as name or call objects in a variety of ways. The most
straightforward way is to use base::quote(). See Examples below.
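As a rough illustration, the sketch below defines a hypothetical wrapper named translate() and builds the two kinds of references described above with base::quote(). The wrapper, its default value "en", and the placeholder tr are assumptions made for this example only. Nothing here is evaluated against transltr itself.

```r
# Hypothetical interface: it has the two required formal arguments and
# would forward them to a Translator object bound to `tr` at runtime.
# The default "en" is an assumption made for this example.
translate <- function(..., source_lang = "en") {
  tr$translate(..., source_lang = source_lang)
}

# A name object: refers to calls written as translate(...).
iface_name <- quote(translate)

# A call object to operator `::`: refers to calls written as
# transltr::translate(...). quote() does not load any namespace.
iface_call <- quote(transltr::translate)

is.name(iface_name)                         # TRUE
is.call(iface_call)                         # TRUE
identical(iface_call[[1L]], as.name("::"))  # TRUE

# Equivalent constructions that do not rely on quote().
identical(iface_name, as.name("translate"))             # TRUE
identical(iface_call, str2lang("transltr::translate"))  # TRUE
```

Either object could then be passed to argument interface. Defining the wrapper does not call it, so tr does not need to exist until translate() is actually used.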
Methodology
find_source() and find_source_in_files() go through these steps to
extract source text from a single R script.
1. It is read with text_read() and re-encoded to UTF-8 if necessary.
2. It is parsed with parse() and underlying tokens are extracted from parsed expressions with utils::getParseData().
3. Each expression (expr) token is converted to a language object with str2lang(). Parsing errors and invalid expressions are silently skipped.
4. Valid call objects stemming from step 3 are filtered with is_source().
5. Calls to method Translator$translate() or to interface stemming from step 4 are coerced to Text objects with as_text().
These steps are repeated for each R script. find_source() further merges
all resulting Text objects into a coherent set with merge_texts()
(identical source code is merged into single Text entities).
Extracted character vectors are always normalized for consistency (at step
5). See normalize() for more information.
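The steps above can be approximated with base R only. This is a simplified sketch under stated assumptions, not the package's code: is_translate_call() is a hypothetical stand-in for is_source(), the script is given as an in-memory character vector instead of being read with text_read(), and the final step returns plain strings rather than Text objects. Normalization and merging are left out.

```r
# Stand-in for the contents of an R script (step 1 is skipped).
code <- c(
  "tr$translate('Hello, world!')",
  "x <- 1 + 1",
  "tr$translate('Farewell, world!')"
)

# Step 2: parse (never evaluate) and extract the underlying tokens.
exprs  <- parse(text = code, keep.source = TRUE)
tokens <- utils::getParseData(exprs, includeText = TRUE)

# Step 3: convert the text of each `expr` token to a language object.
# Anything that fails to parse is dropped silently (NULL is not a call).
expr_text <- tokens$text[tokens$token == "expr"]
langs <- lapply(expr_text, function(txt) {
  tryCatch(str2lang(txt), error = function(e) NULL)
})

# Step 4: keep calls of the form <anything>$translate(...).
# Hypothetical helper, loosely playing the role of is_source().
is_translate_call <- function(x) {
  is.call(x) &&
    is.call(x[[1L]]) &&
    identical(x[[1L]][[1L]], as.name("$")) &&
    identical(x[[1L]][[3L]], as.name("translate"))
}
calls <- Filter(is_translate_call, langs)

# Step 5 (simplified): read the literal first argument of each call.
texts <- vapply(calls, function(cl) cl[[2L]], character(1L))
print(texts)
```

Only the two tr$translate() calls survive the filter. The assignment on the second line of code, and every nested sub-expression, is discarded at step 4.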
Limitations
The current version of transltr can only handle literal
character vectors. This means it cannot resolve non-trivial expressions
that depend on state. All values passed to argument ... of method
Translator$translate() must trivially yield character
vectors.
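The difference between a literal and a non-trivial expression is visible in the parsed call itself. The base R sketch below is only an illustration. The names tr and user_name are hypothetical placeholders and nothing is evaluated.

```r
literal     <- quote(tr$translate("Hello, world!"))
non_literal <- quote(tr$translate(paste("Hello,", user_name)))

# Inspect the first argument of each call without evaluating it.
is.character(literal[[2L]])      # TRUE: the text is known from the source alone
is.character(non_literal[[2L]])  # FALSE: a call whose value depends on user_name
is.call(non_literal[[2L]])       # TRUE
```

The second form would require evaluating paste() with a value for user_name, which a purely lexical extraction cannot do.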
Examples
# Create a directory containing dummy R scripts for illustration purposes.
temp_dir   <- file.path(tempdir(TRUE), "find-source")
temp_files <- file.path(temp_dir, c("ex-script-1.R", "ex-script-2.R"))
dir.create(temp_dir, showWarnings = FALSE, recursive = TRUE)
cat(
  "tr$translate('Hello, world!')",
  "tr$translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[1L]])
cat(
  "tr$translate('Hello, world!')",
  "tr$translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[2L]])
# Extract calls to method Translator$translate().
find_source(temp_dir)
find_source_in_files(temp_files)
# Use custom functions.
# For illustration purposes, assume the package
# exports a hypothetical translate() function.
cat(
  "translate('Hello, world!')",
  "transltr::translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[1L]])
cat(
  "translate('Hello, world!')",
  "transltr::translate('Farewell, world!')",
  sep  = "\n",
  file = temp_files[[2L]])
# Extract calls to translate() and transltr::translate().
# Since find_source() and find_source_in_files() work on
# a lexical basis, these are always considered to be two
# distinct functions. They also don't need to exist in the
# R session calling find_source() and find_source_in_files().
find_source(temp_dir, interface = quote(translate))
find_source_in_files(temp_files, interface = quote(transltr::translate))