Encoding Text#

Consider the following examples of movie reviews:

Example 1

  • Review 1A (negative)
    I thought the movie would be great, but it is not. The script is weak, the pacing is slow, and the ending feels pointless.

  • Review 1B (positive)
    I thought the movie would not be great, but it is. The script is strong, the pacing is brisk, and the ending feels meaningful.

Example 2

  • Review 2A (negative)
    The acting is decent, but the plot is predictable and the jokes fall flat.

  • Review 2B (positive)
    The plot is predictable and the jokes fall flat, but the acting is decent.

Example 3

  • Review 3A (negative)
    For the first ninety minutes I waited for something exciting to happen. Spoiler: it never does.

  • Review 3B (positive)
    For the first ninety minutes I waited for something exciting to happen, and when it finally does it is worth every second.

The problem with a bag-of-words representation is that it treats each pair of reviews above as the same multiset of words: $(\text{acting}: 1,\ \text{plot}: 1,\ \text{predictable}: 1,\ \text{but}: 1,\ \ldots)$.

Because position is discarded, the classifier cannot learn that “not” negates what follows or that the clause after “but” usually carries the main sentiment.
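To see this concretely, we can count the words in Reviews 2A and 2B with Python's `collections.Counter`, a minimal stand-in for a bag-of-words vectorizer. The two reviews express opposite sentiment, yet their word counts are identical:

```python
from collections import Counter

review_2a = "The acting is decent, but the plot is predictable and the jokes fall flat."
review_2b = "The plot is predictable and the jokes fall flat, but the acting is decent."

def bag_of_words(text):
    # Lowercase and strip punctuation before counting. The result is the
    # "multiset of words" view: every bit of positional information is lost.
    tokens = text.lower().replace(",", "").replace(".", "").split()
    return Counter(tokens)

print(bag_of_words(review_2a) == bag_of_words(review_2b))   # True
```

Any classifier that only sees these counts must assign both reviews the same label.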

import numpy as np
from tensorflow.keras.layers import Embedding

# 1. Toy vocabulary: 5 tokens
#    0 = <PAD>, 1 = "good", 2 = "bad", 3 = "not", 4 = "movie"
vocab_size   = 5
embed_dim    = 3   # just three numbers per word
sequence_len = 4

embed = Embedding(input_dim=vocab_size,
                  output_dim=embed_dim)   # `input_length` is deprecated; Keras infers it

# 2. Example review, integer‑encoded and padded:
#    "not good movie"   →   [3, 1, 4, 0]
sample = np.array([[3, 1, 4, 0]])        # shape (1, 4)

dense_seq = embed(sample)                # shape (1, 4, 3)
print(dense_seq.numpy().round(2))
[[[ 0.02  0.03 -0.02]
  [-0.05  0.04 -0.03]
  [ 0.04  0.03  0.05]
  [ 0.   -0.01  0.03]]]

Word Embeddings#

Unlike bag-of-words counts, word embeddings are dense vectors that are trainable: during training they gradually shift so that "good" and "bad" point in different directions, and because the tokens keep their order, a sequential layer (a CNN or RNN) reading them can learn that "not" flips the meaning of what follows.

import numpy as np
from tensorflow.keras.layers import Embedding

# 1. Toy vocabulary: 6 token IDs
#    0 = <PAD>, 1 = "good", 2 = "bad", 3 = "not", 4 = "movie", plus one spare index
vocab_size   = 6
embed_dim    = 2   # just two numbers per word this time
sequence_len = 4

embed = Embedding(input_dim=vocab_size,
                  output_dim=embed_dim)

# 2. Example sequence of token IDs. Index 3 appears twice, so the
#    first and last rows of the output are identical.
sample = np.array([[3, 1, 5, 3]])        # shape (1, 4)

dense_seq = embed(sample)                # shape (1, 4, 2)
print(dense_seq.numpy().round(2))
[[[ 0.02  0.  ]
  [-0.03  0.01]
  [ 0.05  0.02]
  [ 0.02  0.  ]]]
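Under the hood, an `Embedding` layer is nothing more than a lookup into a trainable weight matrix of shape `(vocab_size, embed_dim)`: each token ID selects one row. A pure-NumPy sketch of that lookup (the weight matrix here is random for illustration, not Keras's actual initial values):

```python
import numpy as np

vocab_size, embed_dim = 6, 2
rng = np.random.default_rng(0)

# The layer's only parameters: one row per token in the vocabulary.
W = rng.normal(scale=0.05, size=(vocab_size, embed_dim))

sample = np.array([[3, 1, 5, 3]])   # shape (1, 4), same IDs as above
dense_seq = W[sample]               # fancy indexing == embedding lookup
print(dense_seq.shape)              # (1, 4, 2)

# Repeated token IDs map to identical rows: positions 0 and 3 match.
assert np.array_equal(dense_seq[0, 0], dense_seq[0, 3])
```

Training an embedding layer means adjusting the rows of `W` by gradient descent, just like any other weight matrix.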
Using a CNN#

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# ------------------------------------------------------------------
# 1. Data loading and preprocessing
# ------------------------------------------------------------------
max_features = 20_000      # keep the 20,000 most frequent words
maxlen       = 400         # cut / pad every review to 400 tokens

(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(
    num_words=max_features
)

x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_test  = keras.preprocessing.sequence.pad_sequences(x_test,  maxlen=maxlen)

# ------------------------------------------------------------------
# 2. CNN model
# ------------------------------------------------------------------
model = keras.Sequential([
    layers.Embedding(max_features, 128),   # `input_length` is deprecated; Keras infers it

    layers.Conv1D(64, 7, activation="relu"),
    layers.MaxPooling1D(3),

    layers.Conv1D(64, 7, activation="relu"),
    layers.GlobalMaxPooling1D(),

    layers.Dense(1, activation="sigmoid")
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

model.summary()

# ------------------------------------------------------------------
# 3. Training
# ------------------------------------------------------------------
history = model.fit(
    x_train,
    y_train,
    epochs=8,
    batch_size=128,
    validation_split=0.2,
    verbose=0
)

# ------------------------------------------------------------------
# 4. Evaluation
# ------------------------------------------------------------------
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.3f}")

# ------------------------------------------------------------------
# 5. Predictions (first ten test examples, rounded for readability)
# ------------------------------------------------------------------
preds = model.predict(x_test[:10]).round(3).squeeze()
print("Predicted probabilities:", preds)
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                     Output Shape                  Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ embedding_2 (Embedding)         │ ?                      │   0 (unbuilt) │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv1d (Conv1D)                 │ ?                      │   0 (unbuilt) │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling1d (MaxPooling1D)    │ ?                      │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv1d_1 (Conv1D)               │ ?                      │   0 (unbuilt) │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_max_pooling1d            │ ?                      │             0 │
│ (GlobalMaxPooling1D)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ ?                      │   0 (unbuilt) │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 0 (0.00 B)
 Trainable params: 0 (0.00 B)
 Non-trainable params: 0 (0.00 B)
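The `?` shapes appear because Keras 3 builds the layers lazily, on the first batch. We can still trace the sequence length by hand: a `"valid"` Conv1D with kernel size k shortens a length-n sequence to n - k + 1, and MaxPooling1D with pool size p divides the length by p, rounding down. A small sketch of that arithmetic for the model above:

```python
def conv1d_len(n, kernel):
    # "valid" padding, stride 1: the kernel fits n - kernel + 1 times
    return n - kernel + 1

def pool1d_len(n, pool):
    # non-overlapping windows: floor division by the pool size
    return n // pool

n = 400                          # padded review length
n = conv1d_len(n, 7); print(n)   # 394 after the first Conv1D
n = pool1d_len(n, 3); print(n)   # 131 after MaxPooling1D
n = conv1d_len(n, 7); print(n)   # 125 after the second Conv1D
# GlobalMaxPooling1D then collapses those 125 time steps into a single
# 64-dimensional vector, which feeds the final sigmoid unit.
```

This is why `GlobalMaxPooling1D` is needed: it removes the time dimension so a plain `Dense` layer can produce one probability per review.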

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/util/traceback_utils.py:150, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    148 filtered_tb = None
    149 try:
--> 150   return fn(*args, **kwargs)
    151 except Exception as e:
    152   filtered_tb = _process_traceback_frames(e.__traceback__)

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/util/dispatch.py:1260, in add_dispatch_support.<locals>.decorator.<locals>.op_dispatch_handler(*args, **kwargs)
   1258 # Fallback dispatch system (dispatch v1):
   1259 try:
-> 1260   return dispatch_target(*args, **kwargs)
   1261 except (TypeError, ValueError):
   1262   # Note: convert_to_eager_tensor currently raises a ValueError, not a
   1263   # TypeError, when given unexpected types.  So we need to catch both.
   1264   result = dispatch(op_dispatch_handler, args, kwargs)

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/ops/math_ops.py:1021, in cast(x, dtype, name)
   1014     logging.warn(
   1015         f"You are casting an input of type {x.dtype.name} to an "
   1016         f"incompatible dtype {base_type.name}.  This will "
   1017         "discard the imaginary part and may not be what you "
   1018         "intended."
   1019     )
   1020   if x.dtype != base_type:
-> 1021     x = gen_math_ops.cast(x, base_type, name=name)
   1022 return x

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/ops/gen_math_ops.py:2130, in cast(x, DstT, Truncate, name)
   2128   Truncate = False
   2129 Truncate = _execute.make_bool(Truncate, "Truncate")
-> 2130 _, _, _op, _outputs = _op_def_library._apply_op_helper(
   2131       "Cast", x=x, DstT=DstT, Truncate=Truncate, name=name)
   2132 _result = _outputs[:]
   2133 if _execute.must_record_gradient():

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/framework/op_def_library.py:755, in _apply_op_helper(op_type_name, name, **keywords)
    752 def _apply_op_helper(op_type_name, name=None, **keywords):  # pylint: disable=invalid-name
    753   """Implementation of apply_op that returns output_structure, op."""
--> 755   op_def, g, producer = _GetOpDef(op_type_name, keywords)
    756   name = name if name else op_type_name
    758   attrs, attr_protos = {}, {}

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/framework/op_def_library.py:735, in _GetOpDef(op_type_name, keywords)
    731 try:
    732   # Need to flatten all the arguments into a list.
    733   # pylint: disable=protected-access
    734   g = ops._get_graph_from_inputs(_Flatten(keywords.values()))
--> 735   producer = g.graph_def_versions.producer
    736   # pylint: enable=protected-access
    737 except AssertionError as e:

File /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/tensorflow/python/framework/ops.py:2271, in Graph.graph_def_versions(self)
   2268   if self._finalized:
   2269     raise RuntimeError("Graph is finalized and cannot be modified.")
-> 2271 @property
   2272 def graph_def_versions(self) -> versions_pb2.VersionDef:
   2273   # pylint: disable=line-too-long
   2274   """The GraphDef version information of this graph.
   2275 
   2276   For details on the meaning of each version, see
   (...)   2280     A `VersionDef`.
   2281   """
   2282   return versions_pb2.VersionDef.FromString(self._version_def)

KeyboardInterrupt: 

A CNN is great at spotting local phrases like “not good,” but it quickly forgets where it found them; sentiment often hinges on relations far apart in the text, so we need a model that can keep track of information across the whole sequence.
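To make the "local phrases" intuition concrete, here is a minimal sketch (the kernel size and filter count are illustrative choices, not values from the text) of a `Conv1D` layer sliding a 3‑token window over the embedded toy review from above:

```python
import numpy as np
from tensorflow.keras.layers import Embedding, Conv1D

# Same toy setup as before: 0 = <PAD>, 1 = "good", 2 = "bad", 3 = "not", 4 = "movie"
vocab_size, embed_dim = 5, 3
kernel_size, n_filters = 3, 8     # each filter scans 3 consecutive tokens

embed = Embedding(input_dim=vocab_size, output_dim=embed_dim)
conv  = Conv1D(filters=n_filters, kernel_size=kernel_size, padding="valid")

# "not good movie" → [3, 1, 4, 0], shape (1, 4)
sample   = np.array([[3, 1, 4, 0]])
features = conv(embed(sample))    # one feature vector per 3-token window
print(features.shape)             # (1, 2, 8): 4 - 3 + 1 = 2 windows
```

Each of the two output positions summarizes one 3‑token window, so a filter can learn to fire on "not good" — but anything outside its window is invisible to it.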

  • A Conv1D with kernel size \(k=7\) can only look at \(7\) consecutive tokens at a time.

  • Stacking layers enlarges the receptive field only linearly (one layer sees 7 tokens, two layers see 13, three see 19, and so on).

  • Important cues in a review are often dozens of tokens apart (“I hoped it would be great…but it is not.”).
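The linear growth above follows from simple arithmetic: with stride 1, each extra layer of kernel size \(k\) adds \(k-1\) tokens to the receptive field, giving \(n(k-1)+1\) tokens after \(n\) layers. A quick sketch of the calculation:

```python
def receptive_field(num_layers: int, kernel_size: int) -> int:
    """Tokens visible to one output position after stacking
    num_layers stride-1 Conv1D layers of the same kernel size."""
    return num_layers * (kernel_size - 1) + 1

for n in (1, 2, 3, 4):
    print(n, receptive_field(n, 7))   # 7, 13, 19, 25 — linear growth
```

Covering a 100‑token gap with kernel‑7 layers would take roughly 17 stacked layers — far deeper than the small CNNs used here.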

On the IMDB dataset, a CNN typically reaches an accuracy of about \(0.88\), essentially the same as the bag‑of‑words MLP, and then plateaus. Extra filters or layers add parameters but do not solve the fundamental distance problem.